ABSTRACT


The  control  of  dynamic  systems  subject  to  abrupt,  state-dependent 
structural  changes  such  as  component  failures,  at  random  times,  is 
considered.  This  investigation  is  motivated  by  the  need  for  design 
techniques  that  yield  fault-tolerant  systems,  in  the  sense  that 
they  can  perform  satisfactorily  despite  untoward  events.  This  work 
concentrates on the tradeoffs between good performance and reliability
requirements.

The  approach  used  is  to  formulate  discrete-time  nonlinear 
stochastic  control  problems  that  capture  some  of  the  issues  of 
fault-tolerant control, and to analyze the behavior of the
controllers  obtained  by  solving  these  problems. 

These problems are approached using dynamic programming methods.
A preliminary result is the derivation, for discrete-time noiseless
problems with Markovian structure, of results analogous to existing
results in continuous time. In addition, necessary and sufficient
conditions for the existence of a steady-state controller yielding
finite expected cost are obtained.

This preliminary result is then used to attack the harder
problems of state-dependent structure changes. The basic method
used is to convert the state-dependent problems into the comparison
of a set of constrained (in the state) problems that have
state-independent transition probabilities. First, systems where the
structure transition probabilities depend upon the state in a
piecewise-constant way are considered. For scalar problems with no input
noise an algorithm is obtained that determines the optimal controller
off-line, in advance of system operation. For problems with additional
structure this algorithm collapses into the simultaneous solution of
a set of coupled difference equations that are similar to Riccati
equations.

Two  examples  of  such  problems  are  considered  in  detail;  one 
involves  performance  and  reliability  goals  that  are  conflicting  and 
in  the  other  case  they  are  commensurate.  Both  cases  are  analyzed  to 
see  how  the  optimal  controller  handles  the  tradeoff  between  these 
goals.  One  controller  action  is  to  drive  the  state  to  the  low  cost 


goals. Then additive input noise, more general cost structures and
more general functional dependence of transition probabilities on
the state are considered. The additive noise changes the problem in
a fundamental way since the controller cannot position the state with
certainty. However, an algorithm that yields the optimal controller
can be obtained and qualitative properties of the controller can
be analyzed.

Finally, several extensions of these problems are considered.

Thesis Supervisor: Alan S. Willsky

Title:  Associate  Professor  of  Electrical  Engineering  and 
Computer  Science. 
PART I


INTRODUCTION  AND  BACKGROUND 


1.  INTRODUCTION 


This thesis considers the control of dynamic systems that
experience abrupt structural changes at random times. These changes
are caused by phenomena such as component failures and repairs, and
large environmental disturbances.

This  document  is  divided  into  five  parts.  Part  I  contains 
introductory,  motivational  and  background  material.  It  also  presents 
the  perspective  and  conceptual  basis  of  the  work.  A  different  class  of 
problems  is  considered  in  each  of  the  next  four  parts.  Part  V  closes 
the  thesis  with  a  summary  of  results,  concluding  comments  and  suggestions 
for  further  research. 

1.1  Fault-Tolerant  Systems 

This thesis is motivated by the need for design techniques that
yield automatic systems that are fault-tolerant; that is, systems
which are able to survive and adequately function despite the
occurrence of component failures and other disruptions (see
footnotes 1 and 2).

Some examples of situations where there is need for fault-tolerant
system designs are when:


1. The term fault-tolerance comes from digital computer design, where
fault refers to any disruption in the specified behavior of a system.
For example, see [5].

2. In (English translations of) Russian reliability theory literature,
a fault-tolerant system is any system having components that can be
repaired. For example, see [33].




. failures can jeopardize human lives, such as in
    - life support systems,
    - medical prosthetics,
    - air traffic control systems,
    - automated military systems,
    - systems for handling hazardous material,
    - electric power plants (especially nuclear),
    - aircraft, manned spacecraft, trains, automobiles,
      elevators and other mechanized conveyances.

. failures have high monetary costs, such as in
    - electric power distribution,
    - automated manufacturing processes,
    - communications systems.

. repair or maintenance by humans is inadvisable or
  impossible, such as in
    - deep-space vehicles,
    - deep-water systems,
    - systems operating in extreme temperature,
      radioactive, biohazardous or toxic environments.

We can identify three basic issues that must be taken into account
in the design of fault-tolerant systems. They are

. the type and level of redundancy used,

. the effects of failure-related uncertainties, and

. conflicting system performance and reliability goals.

These design issues will be briefly discussed here.


REDUNDANCY 

Engineering systems have traditionally been made reliable
through the use of redundant components, so that individual failures
need not be catastrophic to the entire system (and by the use of
highly reliable components and assembly procedures so that failures
are unlikely). Redundant components are used to detect failures and
to compensate for them. There are essentially two kinds of
redundancy that can be used:

. direct redundancy - Multiple copies of the same component
  are used, in 'voting' schemes for failure detection
  and as 'backups' for failure compensation.

. functional redundancy - The system is designed so that
  components and subsystems have overlapping
  capabilities.

FAILURE-RELATED UNCERTAINTIES

Failure event uncertainties that must be addressed in
fault-tolerant system designs include:


.  plant  uncertainties  -  Failure  events  change  the  system 

state  or  dynamics  in  ways  and  at  times  that  are 
not  known  in  advance. 

.  detection  uncertainties  -  The  ability  to  detect,  isolate 

and  estimate  failures  is  usually  imperfect.  The 
possibilities  of  incorrect  failure  detections  and 
decisions  must  be  taken  into  account  in  the  system 
design. 


CONFLICTING  GOALS 


The goals of reliability and fault-tolerance may conflict with
other system performance objectives. There are three classes of
costs associated with the attainment of fault-tolerance:

. Fixed costs - Fault-tolerant designs usually require
  additional or different hardware that is not needed
  during fault-free operation. This extra equipment
  may involve not only purchase costs but also degraded
  system performance (e.g., extra weight in aircraft).

. Hedging costs - The operation of a system so that it
  is fault-tolerant may conflict with the optimal way
  to operate the system in fault-free circumstances.
  A cost, in terms of performance loss before failure,
  is paid to improve the expected performance when
  failures occur or to reduce the probability of failure.

. Maintenance costs - Preventive maintenance (and
  inspection) results in direct costs (for parts, labor,
  etc.) as well as performance losses while maintenance
  activities are undertaken.

FOCUS OF THIS WORK

This thesis concentrates on the third fault-tolerance issue
listed above - the tradeoffs and conflicts between reliability goals
and system performance. Specifically, we consider the attainment of
fault-tolerance through control strategies, rather than by direct


We seek control problem formulations that yield controller
designs which endow systems with fault-tolerance. An optimal
fault-tolerant controller should utilize all system capabilities and take
into account all known system limitations and failure likelihoods, so
as to achieve the best tradeoff between reliability and system
performance. We believe this to be an important step in the ongoing
development of theories and methods for fault-tolerant system design.

1.2  Fault-Tolerant  Control 

Fault-tolerant control is the use of control strategies to make
failure-prone systems responsive to untoward events. This requires
the 'building in' of fault-tolerance, by modelling how failures can
happen and what can be done to avoid or overcome them. In general,
fault-tolerant controllers will trade some degradation of performance
quality before failures occur for system 'survival' afterwards.
This may involve component repair, maintenance, or reconfiguration
of the control system.

From an examination of common engineering practices and
consideration of the fault-tolerance needs of engineering systems, some
attributes that fault-tolerant controllers should possess can be
identified. We call them:

. Passive Hedging

. Active Hedging (Risk Reduction)

. Adaptability

. Robustness

. Implementability.

These properties of fault-tolerant controllers are discussed in
this section.

Passive and active hedging require the balancing of conflicts
between system performance and reliability goals. Adaptability
involves the use of redundancy, probabilistic descriptions of failure
occurrences, and the ability to detect them. Robustness and
implementability are necessary for successful operation of any
fault-tolerant controller.

PASSIVE  HEDGING 

This  is  simply  taking  into  account  the  possibility  of  failures 
(and associated costs) in the choice of control. For example, an
automobile  driver  speeding  around  a  curve  might  avoid  the  outer 
edge  of  the  road,  so  that  if  a  tire  blowout  occurs  the  system  can  still 
recover.  Passive  hedging  does  not  involve  using  controls  to  affect 
the  probability  of  future  failure  event  occurrences. 

ACTIVE  HEDGING  (RISK  REDUCTION) 

Probabilistic knowledge of failures may be used to alter their
likelihoods. Preventive maintenance (replacement before failure)
is an example of this. If failure probabilities depend upon control
inputs (directly, or indirectly as a function of the system state)
then controls can be used to actively hedge as well as to minimize
operating costs. For example, voltages and currents in an
electrical system might be kept below levels that cause components to
burn out.

ADAPTABILITY 

In general, some kind of on-line, real-time system testing
and failure detection process must take place. When a failure is
known to have occurred, 'contingency' controls are used. The
primary system goal may then become, for example

. degraded recovery - 'graceful degradation',
  'fail-soft' operation

. safe shutdown - 'fail-safe' operation,
  so as to avoid further system damage by continued
  operation.

The system must detect its failures and reorganize itself to
compensate for them.

ROBUSTNESS AND IMPLEMENTABILITY

Fault-tolerant controller designs should be robust in the sense
that they are insensitive to small disturbances and modelling
inaccuracies. Fault-tolerant control strategies must be implementable
in real-time if they are to be useful. This restricts the complexity
of controller designs.

The controller designs that are obtained using any proposed
fault-tolerant control theory must be evaluated in terms of these
five attributes, to determine if the theory is meaningful. The task
at hand is to develop objective problem formulations that capture
these subjective fault-tolerance attributes. In particular, since we
are concerned here with the balancing of conflicting system performance
and reliability goals, we will focus on the hedging properties of
fault-tolerant controllers.

1.3  Modelling  Fault-Prone  Systems 

A key step in the development of any theory for system design
and analysis is the abstraction of physical reality by approximate
but representative mathematical models. To study fault-tolerant
controllers we must first develop models that adequately capture the
salient characteristics of fault-prone systems. We need models that
are sufficiently realistic for the design of good fault-tolerant
controllers and are mathematically amenable to detailed analysis. We
also require tractable problems in order to gain insight into
fault-tolerant structures.

A characterizing attribute of fault-prone systems is their
operation in different forms or modes. Fault-prone systems experience
abrupt changes in their structure and state from phenomena such as
component failures and repairs, changing subsystem interconnections,
changes in operating points and abrupt environmental disturbances.
Each system form corresponds to some combination of these events.

The state of a fault-prone system can thus be decomposed into
two parts: a form process, which indicates the operational status
of the system, and the rest of the state, which we call the x process.
A logical structure for modelling this kind of arrangement is depicted
in figure 1.1. It is a feedback connection of two subsystems: a form
subsystem that describes abrupt structural changes in the system and
an x-subsystem that represents the dynamic evolution of the system
between form transitions.


Figure  1.1:  General  Hybrid  System  Structure. 

Footnote: In reliability theory the structural conditions of a system are
usually called modes (e.g., normal mode, failure modes, etc.). In control
theory the term mode has a different meaning, and a third definition
pertains to statistical analysis. Since the problems we are
investigating draw from reliability theory, control theory and stochastic
processes, we have elected to avoid the term mode. Instead, form is
used to denote the operational status or structure of the system.


The form is a stochastic process taking values in a finite set.
Its transition probabilities are dependent, in general, on the
x-subsystem state and control inputs. The x-subsystem is modelled by
deterministic or stochastic finite-dimensional vector differential
or difference equations. The parameters of these equations depend on
the form, which feeds into the x-subsystem.

The use of this kind of continuous-plus-discrete-state structure
to model fault-prone systems is not new. For example, some applications
are surveyed in [67]. These systems have been called stochastic hybrid
models by Willsky et al. [75] in the analysis of electric power
systems.

The use of stochastic models when representing fault-prone
systems is essential. As in other control analysis applications, the
system model used must successfully deal with sources of uncertainty
such as

. sensor errors, measurement noise

. parameter errors and other modelling errors
  in the mathematical representation of the
  physical system

. external random disturbances (driving noises)
  that affect the time evolution of the
  system.

For fault-prone systems an additional source of uncertainty comes from
random disturbances that alter the system structure. Deterministic
system models cannot adequately represent these fundamental
system characteristics.

In this thesis we restrict our attention to the fault-tolerant
control of discrete-time systems. There are several reasons for
doing this. The increasingly digital nature of control technology and
the inexpensive availability of microprocessors for components in
'smart' controllers make discrete-time models appropriate for
controller design and analysis. Since implementability is a required
attribute of fault-tolerant controllers, it seems preferable to avoid
problems arising from the discrete approximation of continuous-time
designs, by obtaining discrete-time designs directly.

In addition, the discrete-time formulations of these problems are
more easily analyzed than continuous-time ones. When dynamic
programming is used to solve discrete-time trajectory control problems
there is no partial differential equation that must be solved. Thus we
can sidestep the inability to solve the Bellman equation for control
problems with x-dependent form transition probabilities. This allows
us to gain considerable conceptual insight into the structure of
fault-tolerant control systems.

This research considers discrete-time systems that are special
cases of the following model:


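$$\tilde{x}_{k+1} = A(r_k)\,x_k + B(r_k)\,u_k + \Xi(r_k)\,v_k \qquad (1.1)$$

$$\Pr\{r_{k+1} = j \mid r_k = i,\ \tilde{x}_{k+1} = x,\ u_k = u,\ q_k = q\} = p(i,j;x,u,q) \qquad (1.2)$$

$$x_{k+1} = \text{reset of } \tilde{x}_{k+1} \text{ determined by the form change from } r_k \text{ to } r_{k+1} \qquad (1.3)$$

$$y_k = C(r_k)\,x_k + D(r_k)\,u_k + \Lambda(r_k)\,w_k \qquad (1.4)$$

Here $\tilde{x}_{k+1}$ denotes the value of the x-process before any reset.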


The 'order of operations' is as follows:

(1) at time $k$ the system is in state $(x_k, r_k)$

(2) controls $u_k$ and $q_k$ are chosen

(3) during time interval $(k, k+1)$, the pre-reset value
    $\tilde{x}_{k+1}$ is generated via (1.1)

(4) then $r_{k+1}$ is generated according to (1.2),
    based on $\tilde{x}_{k+1}$, $u_k$, $q_k$ and $r_k$

(5) when the form changes from $r_k$ to $r_{k+1}$,
    $\tilde{x}_{k+1}$ may be "reset" to $x_{k+1}$. This resetting
    is generally nonlinear.

(6) the output of the x-subsystem, $y_k$, is produced
    by (1.4).
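
The order of operations above can be sketched as a single simulation step. This is only an illustrative sketch, not the formulation used in the thesis: the function names (`step`, `trans_prob`, `reset`), the zero-based form labels, the two-form scalar example and every numerical value are assumptions introduced here, and the output equation (1.4) is omitted.

```python
import numpy as np

def step(x, r, u, q, A, B, Xi, trans_prob, reset, rng):
    """One transition of the hybrid model, following steps (1)-(5).

    A, B, Xi map a form to a matrix; trans_prob(i, x, u, q) returns the
    row of form transition probabilities out of form i; reset(x, r_old,
    r_new) maps the pre-reset x to the post-reset x.  All of these names
    are hypothetical.
    """
    v = rng.standard_normal(Xi[r].shape[1])     # white, unit-variance driving noise
    x_pre = A[r] @ x + B[r] @ u + Xi[r] @ v     # step (3): equation (1.1)
    p = trans_prob(r, x_pre, u, q)              # step (4): equation (1.2)
    r_next = int(rng.choice(len(p), p=p))
    # step (5): reset only when the form actually changes
    x_next = reset(x_pre, r, r_next) if r_next != r else x_pre
    return x_next, r_next

# Assumed example: form 0 is "normal", form 1 is "failed" and absorbing.
A = {0: np.array([[0.9]]), 1: np.array([[1.1]])}
B = {0: np.array([[1.0]]), 1: np.array([[0.5]])}
Xi = {0: np.array([[0.1]]), 1: np.array([[0.1]])}

def trans_prob(i, x, u, q):
    if i == 0:
        # x-dependent failure probability: larger when |x| is large
        p_fail = 0.3 if abs(x[0]) > 1.0 else 0.05
        return np.array([1.0 - p_fail, p_fail])
    return np.array([0.0, 1.0])

def reset(x, r_old, r_new):
    return x                                    # no reset in this example

rng = np.random.default_rng(0)
x, r = np.array([2.0]), 0
for k in range(20):
    u = np.array([-0.5 * x[0]])                 # an arbitrary linear feedback
    x, r = step(x, r, u, 0, A, B, Xi, trans_prob, reset, rng)
```

Because the failure probability in `trans_prob` depends on the pre-reset state, the feedback law influences how likely a failure is, which is the active hedging mechanism discussed in section 1.2.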


This convention allows for a failure or other form change to be
modelled as occurring at the final time $k = N$ (when $N < \infty$).

In the above

. time index $k$ takes integer values
  $k \in \{k_0, k_0+1, \ldots, N-1, N\}$

. $x_k \in \mathbb{R}^n$    x-process

. $u_k \in \mathbb{R}^m$    x-controls

. $v_k$ (a real vector)     x-driving noise

. $y_k$ (a real vector)     x-subsystem output

. $w_k \in \mathbb{R}^p$    x-observation noise

The form process $\{r_k : k = k_0, \ldots, N\}$ takes values in a finite set

$$r_k \in M = \{1, 2, \ldots, M\}, \qquad M < \infty.$$

The form controls $\{q_{k_0}, \ldots, q_{N-1}\}$ take values in a finite set

$$q_k \in L = \{1, 2, \ldots, L\}, \qquad L < \infty.$$

$A(r_k)$, $B(r_k)$, $C(r_k)$, $D(r_k)$, $\Xi(r_k)$ and $\Lambda(r_k)$ are
appropriately-dimensioned matrices where:

    A(r)        =  open-loop x dynamics given the current form r
    B(r)        =  x-process input gain in form r
    \Xi(r)      =  x-process driving noise gain in form r
    C(r)        =  x-process sensor gain in form r
    D(r)        =  input-output direct link in form r
    \Lambda(r)  =  x-process observation noise gain in form r.

Thus the model (1.1)-(1.4) is sufficiently general to allow
representation of form-dependence in dynamics, actuators, sensors and noise.
The form transition probabilities $p(i,j;x,u,q)$ in (1.2) must obey

$$p(i,j;x,u,q) \ge 0 \quad \text{for all } i, j \in M \text{ and all } x, u, q$$

$$\sum_{j=1}^{M} p(i,j;x,u,q) = 1 \quad \text{for each } i \in M \text{ and all } x, u, q.$$

The noise processes $\{v_k\}$ and $\{w_k\}$ are assumed to be 'white',
in that

$$E\{[v_k - E(v_k)]'\,[v_s - E(v_s)]\} = 0, \quad s \ne k$$

$$E\{[w_k - E(w_k)]'\,[w_s - E(w_s)]\} = 0, \quad s \ne k$$

with unity variance matrices and

. all elements of $v_k$ and $w_s$ are independent
  (for all times $k$, $s$)

. all elements of $v_k$ and $x_s$, and of $w_k$ and $x_s$,
  are independent for all $k \ge s$

. all elements of $v_k$ and $w_k$ are independent of
  $r_s$ for all $k \ge s$.

A crucial consideration in the modelling of fault-prone systems
is the realistic representation of form transitions. There are two
basic kinds of transitions:

. independent of x
. x-dependent

The x-independent form transitions occur as though no x-to-r
feedback link exists in figure 1.1. They may be uncontrolled, or
controlled by form controls $\{q_0, \ldots, q_{N-1}\}$ that are not chosen in
response to $\{x_0, \ldots, x_{N-1}\}$. Examples of x-independent form shifts
are random 'no wearout' component failures and lightning-induced
failures in electrical power distribution systems.

The x-dependent form transitions are always controllable in some
sense, either by form controls (which can be based on $x_k$) or
through active hedging (by $\{u_k\}$ and the resulting x-process).
Examples of x-dependent form shifts in electrical power distribution
systems include the restructuring of a system when generator-protecting
relays and circuit-breakers trip, human operator control actions based
on observation of x-dynamics (such as switching on auxiliary generators)
and transmission-line failures due to current overloads. Thus form
shifts can be totally unpredictable (as in random 'no wearout' component
failures), totally predictable (as in scheduled, deterministic actions)
or partially predictable (as in the switching of relays precisely (or
approximately) when a random quantity reaches a given threshold).

Suppose that the "reset" operation in (1.3) is linear. Then the
x-subsystem dynamics are linear in x and control u if the form process
$\{r_k\}$ does not depend on x. For such systems with x-dependent forms,
the only source of x-dynamics nonlinearity is through the form transition
probabilities.

In this thesis we will consider x-dependent form transition
probabilities that are piecewise-constant in x (or which can be
approximated as such). This yields a kind of dynamics model that
is amenable to detailed analysis, since it consists of linear 'pieces'.
1.4  Formulating  Fault-Tolerant  Optimal  Control  Problems 

A general fault-tolerant control system structure is shown in
figure 1.2. Both the x-controls $\{u_0, \ldots, u_{N-1}\}$ and the form controls
$\{q_0, \ldots, q_{N-1}\}$ can depend upon possibly noisy observations of current
(or past) values of the hybrid system state (x,r). If these
quantities are not perfectly observable then the design of x and r
estimators is an integral part of the overall fault-tolerant optimal
control problem.

When form transitions are x-dependent, imperfect knowledge of
x causes uncertainty about future form shifts even if r is perfectly
observed. When r is not perfectly observed, failure detection and
isolation (hence form estimation) usually involves some combination
of hypothesis testing ideas and dynamic stochastic estimators
(such as the Kalman filter). A thorough review of failure detection
and isolation methods appears in the survey paper [74].

When the form is not perfectly observed, the control serves a
'dual' purpose. It can be used both to control the state and to probe
for information about it. Tradeoffs between control costs, the costs
resulting from incorrect form detection and the expected benefits of
probing must be considered in these cases. A general discussion of
this 'dual control' phenomenon first appeared in 1960 in the work
of Fel'dbaum [24].

Figure 1.2: Fault-Tolerant Control System Structure

Two  types  of  form  control  actions  are  possible: 

.  indirect  form  control  -  This  is  the  control  of  the 
probabilities  of  form  transitions. 

.  direct  form  control  -  This  involves  control  actions 
that  immediately,  deterministically  change  r. 


An example of indirect form control is preventive maintenance, which
improves failure probabilities (at some incurred cost). Switching to
backup systems in anticipation of (or response to) failures is an
example of direct form control.

Embedded in any fault-tolerant control problem is an implicit
criterion of system reliability. The problem formulation incorporates
models of failure occurrences, and reflects the relative importance
of various form-dependent costs.

In this thesis we propose extensions of the well-known linear
quadratic Gaussian (LQG) control methodology to systems having
randomly jumping structures and parameters that are described by
reliability-theoretic models. We call this the jump linear quadratic
(JLQ) control problem.

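The cost function to be minimized is quadratic in the x-control,
$u_k$. If the system is in state $(x_i, r_i)$ at time $i$, we want to minimize

$$J_i(x_i, r_i) = E\left\{ \sum_{k=i}^{N-1} \left[ u_k'\,R(r_k, q_k)\,u_k + Q(x_{k+1}, r_k, r_{k+1}, q_k) \right] \,\Big|\, x_i, r_i \right\} \qquad (1.5)$$

where the expectation is with respect to $\{v_k\}$, $\{w_k\}$ and $\{r_k\}$.
Here $R: M \times L \to \mathbb{R}^{m \times m}$ is a bounded positive-definite
symmetric matrix-valued function

$$R(r,q) = R'(r,q) > 0 \quad \text{for all } r, q \qquad (1.6)$$

and $Q: \mathbb{R}^n \times M \times M \times L \to \mathbb{R}$ is a bounded nonnegative
scalar-valued function

$$Q(x, r_1, r_2, q) \ge 0 \quad \text{for all } x, r_1, r_2, q. \qquad (1.7)$$

The optimal expected cost-to-go from state $(x_k, r_k)$ at time $k$ is

$$V_k(x_k, r_k) = \min_{\{u_i, q_i \,:\, k \le i \le N-1\}} J_k(x_k, r_k). \qquad (1.8)$$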

Thus the optimal controls are found by minimizing the expected value of
a cost functional which may include:

. operating costs that penalize control energy expenditure
  and system performance differently in each form.

. jump costs that are charged if and when the form changes.
  These might represent start-up or shut-down costs of
  equipment, or undesirable transient phenomena; load
  shedding costs in electric power systems are examples.

. terminal costs dependent upon the final state (including
  form) of the system.

The control costs (and usually the x-cost $Q(x_{k+1}, r_k, r_{k+1}, q_k)$)
are chosen to be quadratic because of the wide applicability and good
robustness properties of linear quadratic control problem formulations
(see, for example, [3]). The $q_k$-dependence of these costs models the
penalties incurred in applying form controls. The Q cost depends, in
general, on the current and prior form ($r_{k+1}$ and $r_k$) so that jump
costs can be included. The control sequences $\{u_k\}$, $\{q_k\}$ are
constrained to be feedback controls of the form

$$u_k = f_k\left[\{y_s : k_0 \le s \le k\},\ \{u_s, q_s : k_0 \le s \le k-1\},\ \{r_s : k_0 \le s \le k\}\right] \qquad (1.9)$$

$$q_k = g_k\left[\{y_s : k_0 \le s \le k\},\ \{u_s, q_s : k_0 \le s \le k-1\},\ \{r_s : k_0 \le s \le k\}\right] \qquad (1.10)$$

That is, the x-control and form control at time k are determined
from past outputs, past (known) controls, and perfectly observed
forms.

Control problems for continuous-time stochastic hybrid systems
with x-independent r have been extensively studied in the literature.
The stochastic hybrid models used are usually special cases of those
analyzed by Gihman and Skorohod [26]. Under the assumption of perfect
observations, continuous-time optimal control problems for a large
class of system dynamics, form transition models and cost functionals
can be reduced to the search for solutions of nonlinear partial
differential equations using 'verification' theorems of dynamic
programming.


Krasovskii and Lidskii [34] obtained most of the results
that are currently available in the literature for stochastic hybrid
system control (with x-independent form processes and perfect state
observations). The problem was studied later by Wonham [76]. He
obtained conditions for the existence and uniqueness of solutions in
the JLQ case, and also derived a separation theorem under Gaussian
noise assumptions for JLQ control problems with Markovian forms and
noisy x (but perfect r) observations. Sworder [63] obtained similar
results using a stochastic maximum principle.

Discrete-time versions of the JLQ x-control problems for
stochastic hybrid systems have not been thoroughly investigated in
the literature. A special case of the x-independent JLQ
discrete-time problem is considered in Birdwell [12] and [13]-[14].

A great deal of work has recently been done concerning the
modelling and analysis of jump processes like those describing the
form subsystems here. References of particular note include [16-18, 20,
25, 42, 62, 71]. An excellent discussion of martingale methods for optimal
control problems is contained in [21].

This thesis focuses on systems where the form observations are
not noisy. This restriction is not made because the noisy-observation
case is unimportant. Rather, even when the form is perfectly observed,
the control problem with x- and u-dependent form transition
probabilities is very difficult and previously unsolved. It is also
important, and useful for the insight it provides into the
tradeoffs between reliability and system performance goals in
fault-tolerant controller designs.

1.5 Problems Addressed and Results Obtained

Using dynamic programming, several classes of the discrete-time
jump linear quadratic (JLQ) control problem formulation of the
last section have been solved. In this section these problems
and results are surveyed.

PART II: JLQ Problems with x-independent forms

In part II of this thesis the 'easiest' class of JLQ problems
is examined. These involve systems with x-independent, Markovian
form processes.

The  noiseless  case  is  addressed  in  chapter  3.  The  control 
laws  that  are  obtained  are  linear  in  x,  with  a  different  law  for  each 
form. The expected costs-to-go are quadratic in x (for each form).
All of the control gains and costs are obtained by solving off-line
a set of M precomputable Riccati-like difference equations (one for
each form).

The continuous-time version of this problem was first solved by
Krasovskii and Lidskii [34], and later by Wonham [76] and Sworder
[67]. A special case of the discrete-time result presented in
chapter 3 appears in Birdwell [12].
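
The off-line computation of per-form gains and costs can be sketched for a scalar system as follows. This is a minimal illustration under assumed conventions that this section does not state: dynamics x_{k+1} = a[r_k] x_k + b[r_k] u_k, stage cost q[r_k] x_k^2 + r[r_k] u_k^2, terminal cost k_terminal[r_N] x_N^2, and a Markov form process with transition matrix p[i][j]. All names are hypothetical, and the recursion is the standard one for this problem class rather than necessarily the exact form derived in chapter 3.

```python
def jlq_backward_recursion(a, b, q, r, p, k_terminal, horizon):
    """Backward recursion for a scalar jump linear quadratic problem.

    Returns (costs, gains): costs[k][i] * x**2 is the expected cost-to-go
    from form i at stage k, and the control law is u = -gains[k][i] * x.
    Assumes r[i] > 0 so that every denominator is positive.
    """
    n_forms = len(a)
    costs = [list(k_terminal)]   # built backwards in time, reversed at the end
    gains = []
    for _ in range(horizon):
        k_next = costs[-1]
        k_now, l_now = [], []
        for i in range(n_forms):
            # Next-stage cost coefficient averaged over the form transitions.
            e_i = sum(p[i][j] * k_next[j] for j in range(n_forms))
            denom = r[i] + b[i] ** 2 * e_i
            l_now.append(a[i] * b[i] * e_i / denom)
            k_now.append(q[i] + a[i] ** 2 * e_i * r[i] / denom)
        costs.append(k_now)
        gains.append(l_now)
    costs.reverse()
    gains.reverse()
    return costs, gains
```

With a single form and p = [[1.0]] this reduces to the ordinary scalar LQ Riccati recursion; with several forms the equations are coupled only through the averaged coefficient e_i.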

For infinite time-horizon problems, steady-state results similar
to those obtained in the standard LQG problem are accessible. An
interesting (but, in retrospect, obvious) fact is that the controlled,
closed-loop dynamics need not be stable in every form, so long
as the probability of entering and remaining in the unstable forms
is not "too large." A similar but less inclusive sufficient
condition for the continuous-time version of the problem was developed
by Wonham [76].
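
The remark that a form may be unstable in closed loop without destroying overall stability can be checked numerically on a toy example. The sketch below is illustrative only and is not the condition derived in this thesis: it assumes a scalar closed-loop system x_{k+1} = c[r_k] x_k with a Markov form process that is independent of x, and it propagates the second moment E{x_k^2} exactly. The numbers and names are hypothetical.

```python
def second_moment(p, c, steps, start_form=0, x0=1.0):
    """Return E{x_k^2} after `steps` steps of x_{k+1} = c[r_k] * x_k.

    m[i] tracks E{x_k^2 ; r_k = i}. One step maps
    m_next[j] = sum_i p[i][j] * c[i]**2 * m[i].
    """
    n = len(c)
    m = [0.0] * n
    m[start_form] = x0 ** 2
    for _ in range(steps):
        m = [sum(p[i][j] * c[i] ** 2 * m[i] for i in range(n)) for j in range(n)]
    return sum(m)

# Form 0 is stable in closed loop (|c| < 1); form 1 is unstable (|c| > 1).
c = [0.5, 1.5]
brief_visits = [[0.9, 0.1], [0.7, 0.3]]   # the unstable form is usually left quickly
long_visits = [[0.9, 0.1], [0.1, 0.9]]    # the unstable form tends to persist

decaying = second_moment(brief_visits, c, steps=100)
growing = second_moment(long_visits, c, steps=100)
```

In the first case the second moment decays toward zero even though form 1 is unstable; in the second case, where the system tends to remain in form 1, it grows without bound.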

These controllers exhibit the desired adaptability property in
that different laws are used in each form. That is, the system
reorganizes itself when a failure occurs so as to best use available
direct and functional redundancy. The controllers derive robustness
and implementability from the linear quadratic nature of the problem.
Passive hedging is used to minimize the expected costs. That is,
potential failures and other form changes are taken into account (via
the cost functional) in the choice of the optimal control. But no
active hedging (controlled modification of failure probabilities) is
possible because of the independent, uncontrollable nature of the
failures.


In chapter 4 several extensions of the x-independent JLQ
problem are considered. These include the addition of jump costs,
linear resets of x, and additive white input and x-observation noises.
The presence of additive (usually Gaussian) white observation
and input noise does not complicate these problems. Since the form
is perfectly observed (with delay), a separation theorem like that of
the standard LQG problem follows. In each form, a Kalman filter
estimates x, and this estimate is then used by the control law for
that form.


PART  III:  Scalar  JLQ  Problems  with  x-dependent  forms 

In part III we consider JLQ control problems that involve
state-dependent structural changes. These problems possess

.  perfect observations of the state (x_k, r_k) at time k,

.  quadratic costs in scalar x_k and u_k, for each form,

.  no driving or observation noises,

.  x dynamics that would be linear, if not for
randomly jumping parameters,

.  jump probabilities that depend upon x in a
piecewise-constant way (with finitely many
pieces) or are approximated as such.
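
A piecewise-constant dependence of a jump probability on scalar x can be represented by a sorted list of breakpoints and one value per interval. The following is a minimal sketch; the class name, the half-open interval convention, and the numbers are assumptions made for illustration.

```python
from bisect import bisect_right

class PiecewiseConstant:
    """A function of scalar x that is constant on finitely many intervals.

    breakpoints must be sorted; values has one more entry than breakpoints.
    values[0] applies for x < breakpoints[0], values[-1] for
    x >= breakpoints[-1], and values[k] for
    breakpoints[k-1] <= x < breakpoints[k] in between.
    """
    def __init__(self, breakpoints, values):
        if len(values) != len(breakpoints) + 1:
            raise ValueError("need exactly one more value than breakpoints")
        self.breakpoints = list(breakpoints)
        self.values = list(values)

    def __call__(self, x):
        return self.values[bisect_right(self.breakpoints, x)]

# Hypothetical failure probability: low on [-1, 1), higher outside it.
p_fail = PiecewiseConstant([-1.0, 1.0], [0.3, 0.05, 0.3])
```

A finer set of breakpoints models the state dependence more accurately at the price of more pieces to carry through the control computation.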


For finite time-horizon problems in the x-dependent case we
have obtained a recursive algorithm that determines the optimal
expected costs-to-go and control laws off-line, in advance of system
operation.


The optimal control laws are piecewise-linear in x (with
x^1, x^0 terms) and the optimal expected costs-to-go are
piecewise-quadratic in x (with x^2, x^1, x^0 terms). The gains and costs are
obtained from a set of precomputable Riccati-like equations (not the
same as in the x-independent failure case). The number of "pieces"
grows only additively (going backwards in time from a finite terminal
time). The additive increase depends upon the number of different
forms that the system can change to (from its current one), and the
number of pieces in the relevant piecewise-constant-in-x transition
probabilities. Thus there is a tradeoff between the accuracy of the
modelling of failure probability state-dependence versus the
computational burden of control law determination (and the complexity
of the controller).

The  optimal  controller  attempts  to  minimize  the  cost  incurred 
both  by  the  usual  LQ  regulator  action,  and  by  driving  the  system 
state  to  regions  where  the  likelihood  of  undesirable  form  shifts  is 
reduced. The different "pieces" of the optimal expected cost-to-go
and control law correspond to using the control to alter form transition
probabilities at various future times, that is, to active hedging.

In  general,  for  infinite  time  horizon  problems  the  number  of 
pieces becomes infinite. Fortunately, for a large class of problems
this is not an obstacle to implementation because most of the control
law  and  cost  pieces  converge.  That  is,  although  the  true  optimal 
control  law  involves  a  (countably)  infinite  number  of  pieces,  each 
valid  over  a  different  range  of  the  x  variable,  most  of  these  pieces 
are  "almost  the  same." 

Thus  there  is  a  tradeoff  between  closeness  to  optimality  and 
controller complexity. Nearly optimal, steady-state controllers
can be obtained to within any specified deviation from optimal, but
with a corresponding level of complexity (number of separate-interval
control laws).

PART  IV:  Extensions  to  the  Scalar  x-dependent  JLQ  Problem 

In  this  part  of  the  thesis  we  extend  the  results  of  chapters 
5-7  to  more  general  JLQ  problems.  In  chapter  8  we  consider  a 
modification  of  the  solution  algorithm  of  Part  III  that  lets  us  solve 
approximately  problems  involving: 

.  x-operating costs and terminal costs that are piecewise-quadratic in x
(with x^2, x^1 and x^0 terms)

.  cost  pieces  that  are  concave-up  as  well  as  concave-down. 

This jump linear piecewise quadratic (JLPQ) control problem is solved using
a recursive algorithm that determines the optimal control law and expected
costs-to-go off-line. As in the JLQ case, the optimal JLPQ control laws
are piecewise-linear in x_k, in each form. The optimal expected
costs-to-go are piecewise-quadratic. Unlike the JLQ case, the number of
pieces of the optimal JLPQ controller may grow at a faster-than-linear rate
as the number of stages from the finite terminal time increases. The
piecewise structure of the optimal controller is caused both by the
piecewise-constant nature of the form transition probabilities (as in
chapters 5-7) and by the piecewise-quadratic nature of the x-operating and
terminal costs.

In chapter 9 we extend the solution methodology of chapters 5-8 to
address a larger class of scalar jump linear control problems, possessing
additive input noise and a more general class of x-dependent form
transition probabilities, x-operating costs and x-terminal costs.
Specifically, we consider scalar jump linear control problems with
quadratic control penalties and

.  input noise densities that are twice continuously differentiable
except at a finite number of points,

.  x-operating costs Q(x,r), x-terminal costs Q_T(x,r) and form
transition probabilities p(i,j;x) that consist of a finite number of
convex or concave (in x) pieces.

We  call  this  the  jump  linear  piecewise  convex  (JLPC)  control  problem. 

Our  study  of  this  class  of  problems  is  motivated  by  a  desire  to  make  the 
solution  approach  of  chapters  5-8  applicable  to  more  realistic  control 
problems.  The  major  extension  of  chapter  9  is  the  inclusion  of  additive 
input  noise  in  the  x-process  dynamics.  Additive  input  noise  profoundly 
changes  the  nature  of  the  optimal  controller.  The  piecewise-quadratic 
structure of the optimal cost and piecewise-linear structure of the
optimal control laws are lost due to the "blurring" effect of the noise. In
chapter 9 we show how JLPC control problems with additive input noise can
be reformulated (at each time stage) as different but equivalent JLPC
problems that do not possess input noise. These reformulated problems can be
solved using the approach of chapters 5-8. The optimal controller for
noisy JLPC problems can be obtained following the steps of an algorithm
(presented in flowchart form) which generates, off-line, the optimal
control laws and expected costs at each time k and from each form j. Since
the optimal control laws are not piecewise-linear in x_k, we don't have the
nice inductive controller structure of the JLQ and JLPQ problems. We
therefore propose a suboptimal approximation of the JLPC controller that
is easier to determine and implement than the optimal controller. The
suboptimal control laws are piecewise-linear in x_k at all times k (and
from each form j).
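
The "blurring" effect of input noise can be illustrated with a single step in a failure probability. The sketch below is a hypothetical example, not part of the chapter 9 algorithm. It assumes a probability that is 0 below a threshold and p_high at or above it, evaluated at x + w where w is uniform on [-h, h]. Averaging over w turns the step into a ramp in x, so the averaged quantity is no longer piecewise constant.

```python
def blurred_step(x, threshold, p_high, half_width):
    """E{p(x + w)} for p = p_high on [threshold, inf) and 0 elsewhere,
    with w uniformly distributed on [-half_width, half_width]."""
    # Fraction of the noise interval that pushes x + w to or past the threshold.
    frac = (x + half_width - threshold) / (2.0 * half_width)
    return p_high * min(max(frac, 0.0), 1.0)

profile = [blurred_step(x, 1.0, 0.5, 0.5) for x in (0.5, 0.75, 1.0, 1.25, 1.5)]
```

Here the averaged probability rises linearly from 0 at x = 0.5 to 0.5 at x = 1.5. Multiplying such a ramp by a quadratic cost gives a cubic term, which is one way to see why the piecewise-quadratic structure does not survive the noise.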

In chapter 10 we examine further extensions of the solution
methodology of Part III. We first consider jump linear control problems
where the x process is not scalar. This class of problems is far more
complicated than the scalar case. However, we can obtain approximate
(suboptimal) controllers for these problems using an algorithm based upon
the suboptimal controller approximation of the JLPC problem (of chapter 9).

We  next  consider  jump  linear  control  problems  involving  u-dependent 
form transition probabilities. This class of problems is of practical
importance since it captures the issue of actuator-dependent failures and it
allows us to examine conflicts between system performance goals and
reliability requirements. The control problems (for scalar x and u) can
be solved using a modified version of the solution algorithm of Part
III. At each time stage the optimal expected cost is piecewise-quadratic
in x.

In chapter 10 we also consider JLQ problems where the form process
can be controlled on the basis of observed x_k and r_k values. This allows
us to study controllers that use strategies such as preventive maintenance,
switching to backup systems in anticipation of failures and the like.
Both direct form control (deterministically switching between forms) and
indirect form control (altering form transition probabilities) are
considered. For scalar-x versions of these problems with x-independent form
transition probabilities (if no form controls are applied), after one time
stage (backwards from a finite terminal time) the optimal control problem
resembles the x-dependent JLPQ problems of chapter 8. The optimal expected
costs-to-go are piecewise-quadratic in x_k and are indexed by the choice of
form control q as well as the current form r, at each time k. The
optimal controller must determine the best form control option on-line, given
observations of (x_k, r_k). These choices are based upon parameters that are
computed (off-line) by Riccati-like difference equations, in a modification
of the algorithms of chapters 7-9.


PART  V:  Conclusions  and  Suggestions  for  Future  Research 

In  chapter  11  we  summarize  the  results  of  this  thesis  and  we 
identify  a  number  of  specific  and  more  general  directions  for 
future  research. 

In  conclusion,  this  thesis  considers  the  control  of  dynamic 
systems  subject  to  abrupt  structural  changes  at  random  times.  It 
is  motivated  by  the  need  for  design  techniques  that  yield  fault- 
tolerant  systems.  This  thesis  concentrates  on  the  tradeoffs  and 
conflicts  between  system  reliability  and  performance  goals. 
Specifically, we consider the attainment of fault-tolerance through
control strategies rather than by direct redundancy. This is, of
course, only part of the overall fault-tolerant design problem.
However, the problem formulations here capture many important issues.
We believe that the problems that are addressed and the results
obtained in this thesis provide an important step in the development
of a general theory of fault-tolerant control.

2. BACKGROUND AND RELATED LITERATURE

The design of fault-tolerant, failure-resistant, dynamically
reliable control systems is a problem that falls within the scope
of both automatic control theory and reliability theory. The purpose
of this chapter is to provide background for this investigation
from both of these fields, and to survey results relating to the
design of fault-tolerant control systems.

In  section  2.1  we  consider  the  relationship  between  the  fault- 
tolerant  control  problem  and  reliability  theory.  In  section  2.2  we 
will  describe  approaches  to  the  design  of  fault-tolerant  control 
systems that are distinctly different from the methods we are
considering. More closely related work on the control of jumping
parameter systems is discussed in section 2.3.

2.1  Relations  to  Reliability  Theory 

Reliability  engineering  is  primarily  concerned  with  the  design 
and  analysis  of  systems  that  can  perform  their  missions  with  high 
probability  despite  component  failures. 

Reliability developed as an engineering discipline in response
to the military requirements of World War II. The first formal
reliability study reportedly (see [23]) sought to explain why
German V1 and V2 missiles performed so poorly despite their construction
from highly reliable components.


Following  the  war,  complex  system  design  problems  in  the 
electronic,  nuclear,  aircraft  and  space  industries  gave  impetus 
to  the  field.  Most  of  this  early  work  involved  the  modelling  of 
failure  phenomena  and  the  collection  of  component  failure  data. 

Early  theoretical  considerations  of  reliability  in  the  context 
of automata theory (von Neumann [73]) and reliable circuit synthesis
(Moore and Shannon [43]) concerned achieving overall reliability
through  the  "proper"  use  of  unreliable  components. 

The first book on reliability (by Bazovsky) did not appear until
1961 [8]. It was followed by a number of texts in the early 1960's,
such as [6], [19], [28], [46], [52], [54], [72] and [81].
Three more recent texts are [29], [43] and [23]. The works of
Gnedenko et al. [27] and Barlow and Proschan [7] provide more
mathematically rigorous treatments of reliability theory.

Current  activity  in  reliability  theory  consists,  in  large  part, 
in  the  development  of  mathematical  theories  and  associated  computerized 
algorithms  for  the  analysis  of  reliability  characteristics  for  systems 
composed  of  highly  reliable  components.  In  most  contemporary  engineering 
applications,  many  (or  all)  of  a  system's  component  parts  must  be 
extremely  reliable  if  strict  system  reliability  standards  are  to 
be met. One motivation for the development of a dynamic control
approach to reliability engineering is the existence of problems (for
example, electric power systems) in which the system's dynamics and its
reliability are intrinsically intertwined. Also the use of controls
to  achieve  reliability  may,  in  some  applications,  facilitate  the  use 
of  fewer  and  less  reliable  (that  is,  less  expensive)  components  in  the 
design  of  reliable  systems. 

There  are  two  basic  approaches  that  are  currently  used  for  the 
reliability analysis of complex systems (or proposed designs). One
approach  might  be  called  the  'static'  consideration  of  system 
reliability.  This  kind  of  analysis  seeks  to  determine  the  probability 
that  a  given  system  will  not  fail  (or  will  achieve  various  degraded 
modes  of  operation)  after  some  fixed  time  interval,  based  on  a  priori 
information  about  the  components,  their  connections,  etc.  Some 
examples of this static approach, which involves fault-trees, cut sets,
graph theory and the like, are in [39], [44].

A  second  approach  to  system  reliability  analysis  focuses  on  the 
dynamic  behavior  of  system  failure  probabilities.  It  involves  the  use 
of  queueing  theory  models  of  complex  systems.  Queueing  systems  might 
be  thought  of  as  combinations  of  sequences  of  elementary  operations 
such as single component failures, repairs or replacements, maintenance,
fault searches and detections, successful component operation prior to
failure, etc. These elementary operations overlap in time, in general.
They are usually considered to be independent of each other, depending
only on the operational status of the overall system.
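
The simplest instance of such a dynamic model is a single component that alternates between working and failed states. The sketch below is a deliberately reduced illustration (a two-state discrete-time Markov chain, not a full queueing model); the per-step failure and repair probabilities are hypothetical.

```python
def availability(p_fail, p_repair, steps, working0=1.0):
    """Probability that a component is working after each time step.

    A working component fails with probability p_fail per step, and a
    failed one is repaired with probability p_repair per step.
    Returns a list of length steps + 1, starting with working0.
    """
    history = [working0]
    w = working0
    for _ in range(steps):
        w = w * (1.0 - p_fail) + (1.0 - w) * p_repair
        history.append(w)
    return history

history = availability(p_fail=0.1, p_repair=0.4, steps=200)
```

The probability of being operational converges to p_repair / (p_fail + p_repair), here 0.8, regardless of the initial condition.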


The outlook of this thesis is in the spirit of this second
approach to reliability analysis. However, we are particularly concerned
with the dynamic performance of systems and the evolution of
(continuous-state space) physical quantities as well as the failure
status  of  the  components  that  manipulate  these  quantities.  We  want 
to  formulate  control  problems  that  achieve  good  system  performance 
and  high  reliability.  It  is  important  to  realize  that  the  goals  of 
reliability  and  performance  may  be  conflicting.  For  example,  the  use 
of  a  large  control  to  quickly  drive  the  system  into  a  safe  region  of 
the  state  space,  so  as  to  reduce  the  probability  of  a  failure,  may 
entail  a  large  control  cost.  On  the  other  hand  the  use  of  control  to 
maximize  performance  may  result  in  a  loss  of  system  reliability. 
Reliability  considerations  often  limit  the  performance  that  can  be 
obtained  from  a  system;  electric  power  systems  are  an  example  of  this. 

The  motivation  for  our  work  is  a  desire  to  obtain  a  systematic, 
objective  means  for  designing  systems  that  take  into  account  the  need 
for  both  high  reliability  and  performance  and  also  account  for  possible 
intrinsic  conflicts  between  these  goals.  Consequently  such  systems 
should  use  available  system  redundancy  in  a  quantifiably  efficient 
manner. 

2.2  Other  Approaches  to  Fault-Tolerant  Control 

A  number  of  approaches  to  the  design  of  fault-tolerant  control 
systems  that  are  distinctly  different  from  the  methods  used  here  have 
been considered previously. We will survey them here. In the next
section we will then consider previous work that is more closely
related to ours, and we will indicate how previous efforts differ
from the work of this thesis.

A mathematical framework for building reliable control systems
through the use of redundant, less reliable controllers is presented
in the work of Siljak [61]. This approach is a direct extension,
in spirit, of the work of von Neumann [73] for automata, Moore
and Shannon [43] in synthesizing reliable circuits, and Barlow and
Proschan [7] in constructing reliable systems from unreliable
components. In [61], control reliability is defined to be the probability
that a given control structure will ensure stability of the controlled
system under a specified class of failures which occur with known
probabilities. Experimental observations indicate high reliability
of decentralized control schemes for large systems with respect to
structural perturbations of interconnections and nonlinearities of
subsystem couplings [59], [60], [22] and low reliability of these
same decentralized strategies when the system is subject to structural
perturbations in feedback interconnections and controller failures.
The main reason is that, in reliability-theoretic terms, decentralized
controllers are generally series connections of controllers; hence
any one controller failure can cause total system failure. The
natural solution suggested by reliability theory is to introduce a
kind of parallel controller action, through multiple control systems
that have "functional" redundancy (i.e., overlap of capabilities).
This is explored in [61]. This kind of overlapping decentralized
control system decomposition has been used for the modelling and control
of a string of high-speed vehicles [4] and in freeway traffic flow
regulation [32].
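
The contrast between series and parallel connections can be made concrete with the elementary reliability formulas for independent components. This is a textbook sketch with hypothetical numbers, not a model of any particular controller structure from [61].

```python
from math import prod

def series_reliability(component_reliabilities):
    """All components must work; with independent failures the
    individual reliabilities multiply."""
    return prod(component_reliabilities)

def parallel_reliability(component_reliabilities):
    """At least one component must work; with independent failures the
    individual failure probabilities multiply."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)

three_in_series = series_reliability([0.9, 0.9, 0.9])
two_in_parallel = parallel_reliability([0.9, 0.9])
```

Three controllers of reliability 0.9 in series give an overall reliability of about 0.729, worse than any single one, while two of them in parallel give about 0.99.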

Another approach to the analysis of reliable systems appears in
the work of Beard [9]. He examines 'self-reorganizing' linear
systems which restructure themselves to compensate for actuator and
sensor failures, using the functional redundancy of their components.
Beard's approach is to identify any change (from a set of known
possibilities) and then to attempt to alter the system's feedback control
law so as to achieve closed-loop stability. He obtains bounds on the
number of actuators and sensors needed (that is, the level of component
redundancy) using controllability and observability criteria.

A third method for achieving fault-tolerant designs makes no
explicit reference to reliability theory. This approach is to try to
obtain a kind of "passive" fault-tolerance through the design of
non-adapting, robust controllers that attempt to provide satisfactory control
in all forms. The fundamental work on the robustness of feedback
systems is that of Bode [15]. These results were extended by Horowitz
[30], [31], Kreindler [35] and others, and by Kwakernaak and Sivan
([38], p. 427) in the discrete-time case. Geometric approaches to the
analysis of robustness properties of feedback controllers have been
used by Wong [77], [78], Zames [79], [80] and Safonov and Athans
[55], [56], [57]. In particular, Safonov [56] has obtained
conditions characterizing the robustness of controllers when parameter
variations result from a change of operating points in a nonlinear
system. The recent thesis of Lehtomaki [41] provides a common
framework for these and new robustness tests.

An alternative approach to the design of fault-tolerant controllers
is the use of actively adapting controllers that respond to changes in
the operating environment. There are a large number of diverse problem
approaches and formulations that go by the name 'adaptive control', some
of which are relevant to fault-tolerant control. We will not review
these here since excellent general surveys exist (see, for example,
[3], [1], and [40]).

2.3  Control  of  Jumping  Parameter  Systems 

In  this  thesis  we  consider  control  problem  formulations  that 
explicitly  include  the  possibility  of  system  failures  and  structural 
changes. We propose an extension of the well-known linear quadratic (LQ)
control problem to include systems having randomly jumping parameters,
and costs that reflect these changes in system structure. As discussed
in Section 1.4, in this way we hope to capture some of the reliability
and performance tradeoffs in the fault-tolerant control problem. We
call this the jump linear quadratic (JLQ) control problem.

Control problems involving systems having jumping parameters are
not new. For example, some applications are surveyed in [67]. These
continuous-plus-discrete-state models have been called stochastic
hybrid models by Willsky et al. [75] in the analysis of electric
power systems. Control problems for continuous-time stochastic hybrid
systems having state- and control-independent discrete-state parts
(i.e., x-independent form processes in the terminology of section 1.3)
have been extensively studied in the literature.

The stochastic hybrid models used are usually special cases of
those analyzed by Gihman and Skorohod [26]. Under the assumption of
perfect observations, continuous-time optimal control problems for a
large class of system dynamics, form transition models and cost
functionals can be reduced to the search for solutions of nonlinear partial
differential equations using 'verification' theorems of dynamic
programming. Krasovskii and Lidskii [34] obtained most of the results
that are currently available in the literature for stochastic hybrid
system control (with x-independent form processes and perfect state
observations). The problem was studied later by Wonham [76]. He
obtained conditions for the existence and uniqueness of solutions in
the JLQ case, and also derived a separation theorem under Gaussian noise
assumptions for JLQ control problems with Markovian forms and noisy x
(but perfect r) observations. Sworder [63] obtains similar results
using a stochastic maximum principle and has published a number of
extensions with his co-workers, including [45], [64], [65], [66],
[68], [69]. Stochastic minimum principle formulations for
continuous-time problems involving jump processes have also been considered
by Rishel ([48], [49], [50], [51]), Kushner [36], and others.

Robinson and Sworder [53], [70] have derived the appropriate
nonlinear partial differential equation for continuous-time jump
parameter systems having state- and control-dependent rates. A similar result
appears in the work of Kushner [36] and an approximation method for the
solution of such problems has been developed by Kushner and DiMasi [37].
This is important work but technical issues, such as the lack of existence
of closed form solutions, make it difficult to expose how the optimal
controller effects the tradeoff between performance and reliability.

The  major  focus  of  this  thesis  (i.e.,  part  III)  is  on  systems 
subject  to  structural  form  changes  that  can  be  implicitly  controlled, 
through  the  dependence  of  form  transition  probabilities  on  the  continuous 
part  of  the  state.  This  dependence  allows  for  the  modelling  of 
conflicts  between  performance  and  reliability  goals.  We  choose  to 
consider  discrete-time  versions  of  the  jump  linear  quadratic  (JLQ) 
control problem, rather than extend the continuous-time x-dependent
results of Sworder [53], [70], because the discrete-time formulation is
amenable to detailed analysis. In discrete time we can get insight into
how the optimal controller balances reliability and performance goals.
Qualitative fault-tolerance concepts such as active hedging can be
quantified in the discrete-time setting.

The control of jumping parameter systems in discrete time has
not been as thoroughly investigated as in continuous time. The only
results available in the literature are for x-independent JLQ problems
where the actuator is form-dependent. These are considered in the
thesis of Birdwell [12] and in [13], [14].

As  a  preliminary  step  in  our  investigation  we  also  consider 
discrete-time JLQ problems with x-independent forms. The derivation
of the basic result is straightforward and analogous to the
continuous-time problem for finite time horizons. We obtain some interesting
results regarding infinite time horizon problems, including necessary
and sufficient conditions for the existence of steady-state optimal
controllers. These results are stronger than the corresponding
continuous-time sufficient conditions obtained by Wonham [76], and they provide
significant  insight  into  the  different  types  of  behavior  that  can  be 
exhibited  by  JLQ  systems. 

2.4  Summary 

In  this  thesis  we  consider  the  design  of  fault-tolerant  control 
systems  through  the  jump  linear  quadratic  control  problem  formulation 
that  was  introduced  in  Chapter  1.  These  problems  involve  the  control 
of  continuous-plus-discrete  state,  stochastic  hybrid  systems. 


Continuous-time control problems for such systems have been
extensively studied in the x-independent form case (with perfect form
observations). The continuous-time x-dependent case leads to
nonlinear partial differential equations that are analytically
intractable, although approximation techniques have been proposed.
The results available for the continuous-time case don't expose how
the tradeoff between reliability and performance is effected by the
optimal controller.

We  consider  discrete  time  problems  in  order  to  obtain  some 
understanding  of  the  control  tradeoffs  involved  between  system 
performance and reliability goals when structural changes and failures
depend upon the continuous part of the state. The main focus of this
thesis is on problems involving x-dependent form transitions since
this dependence allows for the modelling of conflicts between
performance and reliability. To the best of our knowledge, discrete-time
problems with this x-dependence have not been studied previously in
detail.

3. NOISELESS MARKOVIAN-FORM OPTIMAL CONTROL PROBLEMS


3.1  Introduction 

In this chapter we consider a special class of the jump linear
quadratic (JLQ) control problem formulated in Chapter 1. We examine
the optimal control of jump linear systems having
.  x-independent Markovian form processes
.  perfect state observations and no noises
.  purely quadratic operating and terminal costs
.  no 'resets' of x when the form changes.

This class of problems is formulated and solved in sections 3.2-3.3.

The optimal control laws are linear in $x_k$ (a different law for each form)
and the optimal expected costs-to-go are quadratic in $x_k$. These control
laws and costs can be computed off-line, in advance of system operation,
by solving M coupled Riccati-like matrix difference equations.

The continuous-time version of this problem was first formulated
and solved by Krasovskii and Lidskii [34], and later by Wonham [70] and
Sworder [63]. A special case of the discrete-time result presented here
appears in Birdwell [12-14].

The solution of the discrete-time JLQ problem that is developed here,
which will be used in later chapters, is a necessary logical first step
in the study of more general control problems for systems with abruptly
changing structure. The controller derivation presented here is
conceptually straightforward. However, study of the optimal controller
provides valuable insights into the qualitative behavior and stability
properties of jump linear systems. Several of these properties are
highlighted by example problems in section 3.4.

In section 3.5 the steady-state control problem is considered.
Necessary and sufficient conditions are derived for the existence of
a set of steady-state constant expected cost-to-go functions. It is
shown that the corresponding set of time-invariant steady-state control
laws stabilizes the controlled system, in that $E\{x_k' x_k\} \to 0$ as
$(k - k_0) \to \infty$, and that the steady-state control laws minimize the
limiting expected cost-to-go as $(N - k_0) \to \infty$, with finite optimal
expected cost.

A more restrictive sufficient condition for the continuous-time
version was developed by Wonham [76]. To the best of our knowledge,
the discrete-time steady-state results are new.

3.2  Problem  Formulation 

Consider the discrete-time jump linear system

\[ x_{k+1} = A_k(r_k)\, x_k + B_k(r_k)\, u_k \tag{3.1} \]

\[ \Pr\{ r_{k+1} = j \mid r_k = i \} = p_{k+1}(i,j) \tag{3.2} \]

where

\[ x(k_0) = x_0 , \qquad r(k_0) = r_0 . \]

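In the above we have

.  time index k takes integer values
   $k \in \{k_0, k_0+1, \ldots, N-1, N\}$

.  $x_k \in R^n$   x-process

.  $u_k \in R^m$   x-control

The form process $\{r_k : k = k_0, \ldots, N\}$ is a finite-state Markov chain
taking values in

\[ r_k \in \underline{M} = \{1, 2, \ldots, M\}, \qquad M < \infty . \]

That is,

\[ \Pr\{ r_{k+1} = j \mid r_k, r_{k-1}, \ldots, r_{k_0} \} = \Pr\{ r_{k+1} = j \mid r_k \} \qquad \forall\, j \in \underline{M} \text{ and } k , \]

where the form transition probabilities $p_k(i,j)$ in (3.2) must
satisfy

\[ p_k(i,j) \ge 0 \qquad \forall\, i, j \in \underline{M} \text{ and } k \]

\[ \sum_{j=1}^{M} p_k(i,j) = 1 \qquad \forall\, i \in \underline{M} \text{ and } k . \]

Here $A_k(\cdot)$ and $B_k(\cdot)$ are appropriately-dimensioned matrices where,
for $i \in \underline{M}$,

   $A_k(i)$ = open-loop x dynamics in form i
   $B_k(i)$ = x-process input gain in form i.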


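The cost criterion to be minimized is

\[ J_{k_0}(x_0, r_0) = E\Big\{ \sum_{k=k_0}^{N-1} \big[ u_k' R_k(r_k) u_k + x_{k+1}' Q_{k+1}(r_{k+1}) x_{k+1} \big] + x_N' K_T(r_N) x_N \Big\} . \tag{3.4} \]

The $R_k(j)$, $Q_{k+1}(j)$ (for each $k = 0, \ldots, N-1$) and $K_T(j)$ are positive-
semidefinite symmetric matrices for each $j \in \underline{M}$, where

\[ R_k(j) + B_k'(j) \Big[ \sum_{i=1}^{M} p_{k+1}(j,i)\, Q_{k+1}(i) \Big] B_k(j) > 0 . \tag{3.5} \]

In particular, (3.5) is satisfied if

\[ R_k(j) > 0 , \qquad Q_k(j) \ge 0 \]

for all $j \in \underline{M}$ and times k.

The $x_N' K_T(r_N) x_N$ term is a terminal cost in addition to
$x_N' Q_N(r_N) x_N$.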

3.3  Problem  Solution 

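The optimal control law can be derived using dynamic programming
[10]. Let $V_k(x_k, r_k)$ be the expected cost-to-go from state
$(x_k, r_k)$ at time k. The optimal control laws are

\[ u_{k-1}(x_{k-1}, r_{k-1} = j) = -L_{k-1}(j)\, x_{k-1} \]

with optimal expected costs-to-go

\[ V_k(x_k, r_k = j) = x_k' K_k(j)\, x_k , \]

where the optimal gains $L_{k-1}(j)$ are given by

\[ L_{k-1}(j) = \Big[ R_{k-1}(j) + B_{k-1}'(j) \Big( \sum_{i=1}^{M} p_k(j,i) \big[ Q_k(i) + K_k(i) \big] \Big) B_{k-1}(j) \Big]^{-1} B_{k-1}'(j) \Big( \sum_{i=1}^{M} p_k(j,i) \big[ Q_k(i) + K_k(i) \big] \Big) A_{k-1}(j) \tag{3.7} \]

for each $j \in \underline{M}$, and the sequence of sets of positive semi-definite
symmetric matrices $\{K_k(j) : j \in \underline{M}\}$ satisfies the set of M coupled
matrix difference equations

\[ K_{k-1}(j) = A_{k-1}'(j) \Big( \sum_{i=1}^{M} p_k(j,i) \big[ Q_k(i) + K_k(i) \big] \Big) \big[ A_{k-1}(j) - B_{k-1}(j) L_{k-1}(j) \big] \tag{3.8} \]

with terminal conditions

\[ K_N(j) = K_T(j) . \]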

The value of the optimal expected criterion (3.4) that is achieved
with these control laws is given by

\[ x_0' K_{k_0}(r_0)\, x_0 . \]


The  proof  of  this  Proposition  is  contained  in  Appendix  B.l. 


Note that the $\{K_k(j) : j \in \underline{M}\}$ and optimal gains $\{L_k(j) : j \in \underline{M}\}$
can be recursively computed off-line, using the M coupled difference
equations (3.7)-(3.8). The M coupled Riccati-like matrix difference
equations cannot be written as a single nM-dimensional Riccati equation,
because of the inverse terms. Proposition 3.1 essentially¹ appears
in Birdwell's thesis [12], where it is called the switching gain
solution.
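As a hedged sketch of where the gains of Proposition 3.1 come from (the proof itself is the one in Appendix B.1), the single dynamic programming step from time k back to time k-1 in form j can be written out as follows. The shorthand $S_k(j)$ is introduced here only for compactness and is not notation used elsewhere in this chapter; $x$ denotes $x_{k-1}$ and $u$ denotes $u_{k-1}$.

\[
\begin{aligned}
S_k(j) &:= \sum_{i=1}^{M} p_k(j,i)\,\big[\, Q_k(i) + K_k(i) \,\big] \\
V_{k-1}(x, j) &= \min_{u} \Big\{ u' R_{k-1}(j)\, u + \big(A_{k-1}(j) x + B_{k-1}(j) u\big)'\, S_k(j)\, \big(A_{k-1}(j) x + B_{k-1}(j) u\big) \Big\} \\
0 &= R_{k-1}(j)\, u + B_{k-1}'(j)\, S_k(j)\, \big(A_{k-1}(j) x + B_{k-1}(j) u\big) \\
u &= -\big[ R_{k-1}(j) + B_{k-1}'(j) S_k(j) B_{k-1}(j) \big]^{-1} B_{k-1}'(j) S_k(j) A_{k-1}(j)\, x \;=\; -L_{k-1}(j)\, x \\
V_{k-1}(x, j) &= x' A_{k-1}'(j)\, S_k(j)\, \big[ A_{k-1}(j) - B_{k-1}(j) L_{k-1}(j) \big]\, x
\end{aligned}
\]

The second line uses the Markov property to replace the expectation over the next form by the weighted sum $S_k(j)$; the third line is the stationarity condition in $u$; and the matrix being inverted is positive definite because of (3.5) together with $K_k(i) \ge 0$. The inverse depends on j through $S_k(j)$, which is why the M recursions stay coupled rather than collapsing into one larger Riccati equation.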


3.4  Examples  and  Discussions 

In this section some qualitative aspects of the JLQ controller
given in Proposition 3.1 are illustrated via example systems. For
convenience, the examples considered here are time-invariant and scalar
in x with M=2 forms. That is,

\[ x_{k+1} = a_1 x_k + b_1 u_k \qquad \text{if } r_k = 1 \]

\[ x_{k+1} = a_2 x_k + b_2 u_k \qquad \text{if } r_k = 2 \]

\[ \min E\Big\{ \sum_{k=0}^{N-1} \big[ u_k^2 R(r_k) + x_{k+1}^2 Q(r_{k+1}) \big] + x_N^2 K_T(r_N) \Big\} \]

with form transition probabilities as shown in Figure 3.1.

¹Time-invariant parameters with A, R, Q independent of the form r.
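From Proposition 3.1 we see that the optimal expected costs-to-go
and control laws are

\[ V_k(x_k, r_k = j) = x_k^2 K_k(j) , \qquad j = 1, 2 \]

\[ u_k(x_k, r_k = j) = -L_k(j)\, x_k , \qquad j = 1, 2 \]

where

\[ L_{k-1}(j) = \frac{ b_j a_j \big\{ p_{j1} [Q(1) + K_k(1)] + p_{j2} [Q(2) + K_k(2)] \big\} }{ R(j) + b_j^2 \big\{ p_{j1} [Q(1) + K_k(1)] + p_{j2} [Q(2) + K_k(2)] \big\} } \]

and

\[ K_{k-1}(j) = a_j \big\{ p_{j1} [Q(1) + K_k(1)] + p_{j2} [Q(2) + K_k(2)] \big\} \big[ a_j - b_j L_{k-1}(j) \big] , \qquad K_N(j) = K_T(j) , \]

for j=1,2 and k = N, N-1, ..., 0.

The closed-loop optimal system thus obeys

\[ x_{k+1} = \big[ a_j - b_j L_k(j) \big] x_k = \frac{ a_j R(j) }{ R(j) + b_j^2 \big\{ p_{j1} [Q(1) + K_{k+1}(1)] + p_{j2} [Q(2) + K_{k+1}(2)] \big\} }\, x_k \tag{3.11} \]

for k = 0, 1, ..., N-1 and $r_k = j$.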

The $\{K_k(j),\ j \in \underline{M}\}$ may or may not converge as k decreases from
N, and $x_k$ may or may not be driven to zero, as shown in the following
examples.

Example 3.1: Here is an example in which the $K_k(j)$ converge quickly
and x is driven to zero. Let


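\[ x_{k+1} = x_k + u_k \qquad \text{if } r_k = 1 \]

\[ x_{k+1} = 2 x_k + 2 u_k \qquad \text{if } r_k = 2 \]

with

\[ p(i,j) = .5 , \qquad i, j = 1, 2 . \]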

The optimal costs, control gains and closed-loop dynamics (computed
using (3.9)-(3.11)) are given in Tables 3.1 and 3.2, for four
iterations:

            K_k(1) = L_k(1)     K_k(2) = L_k(2)
   k=N-1    .5                  .8
   k=N-2    .6226415            .868421
   k=N-3    .6357717            .87472
   k=N-4    .6370559            .875327

Table 3.1: Optimal Gains and Costs of Example 3.1

            a_1 - b_1 L_k(1)    a_2 - b_2 L_k(2)
   k=N-1    .5                  .4
   k=N-2    .3773585            .263158
   k=N-3    .3642283            .25056
   k=N-4    .3629441            .249346

Table 3.2: Closed-Loop Dynamics of Example 3.1


The expected cost parameters $K_k(j)$ and optimal gains $L_k(j)$ are
converging as (N-k) increases. The same is true for the closed-loop
systems, which are stable:

\[ | a_j - b_j L_k(j) | < 1 \]

for all times k = N-1, N-2, ..., 0 and $j \in \underline{M}$. Conditions for convergence
will be addressed in the next section.

In the 'worst case' of $r_k = 2$ for all times k = 0, 1, ...,

\[ \lim_{N \to \infty} |x_N| \le \lim_{N \to \infty} (.5)^{N-1} |x_0| = 0 . \]

Thus x is driven to zero by the optimal controller.
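The backward recursion behind Tables 3.1 and 3.2 can be sketched in a few lines of Python. This is an illustration rather than the procedure actually used to produce the tables: the function name `jlq_scalar_recursion` is hypothetical, and the Example 3.1 data passed to it (form 2 taken as a = b = 2, with Q = R = 1 and K_T = 0 in both forms, and all transition probabilities .5) are assumptions, chosen because they reproduce the tabulated values.

```python
def jlq_scalar_recursion(a, b, q, r, p, k_terminal, steps):
    """Backward recursion for the scalar, time-invariant JLQ problem.

    a, b, q, r, k_terminal are per-form lists; p[j][i] is the probability
    of moving from form j to form i.  Returns a list of (K, L) pairs of
    per-form lists, the first entry being for k = N-1.
    """
    m = len(a)
    k_next = list(k_terminal)              # K_N(j) = K_T(j)
    history = []
    for _ in range(steps):
        k_now, gains = [], []
        for j in range(m):
            # expected next-step weight as seen from form j
            s = sum(p[j][i] * (q[i] + k_next[i]) for i in range(m))
            gain = a[j] * b[j] * s / (r[j] + b[j] ** 2 * s)
            gains.append(gain)
            k_now.append(a[j] * s * (a[j] - b[j] * gain))
        history.append((k_now, gains))
        k_next = k_now
    return history


# Assumed Example 3.1 data (see the note above).
example_3_1 = jlq_scalar_recursion(
    a=[1.0, 2.0], b=[1.0, 2.0], q=[1.0, 1.0], r=[1.0, 1.0],
    p=[[0.5, 0.5], [0.5, 0.5]], k_terminal=[0.0, 0.0], steps=4)

for step, (costs, gains) in enumerate(example_3_1, start=1):
    closed_loop = [1.0 - 1.0 * gains[0], 2.0 - 2.0 * gains[1]]
    print(f"k=N-{step}: K={costs} L={gains} a-bL={closed_loop}")
```

Setting `p=[[1.0, 0.0], [0.0, 1.0]]` in the same call decouples the two forms and gives the standard LQ recursion for each form separately.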

This example demonstrates the passive hedging behavior of the
optimal controller. That is, possible future form changes and their
associated costs are taken into account. To see this, consider the
usual LQ regulator gains and cost parameters (as if $p_{11} = p_{22} = 1$ and
$p_{12} = p_{21} = 0$), which are listed in Table 3.3.

            K_k(1) = L_k(1)     K_k(2) = L_k(2)
            (with p_11 = 1)     (with p_22 = 1)
   k=N-1    .5                  .8
   k=N-2    .6                  .8780487
   k=N-3    .6153846            .8825214
   k=N-4    .617647             .8827678

Table 3.3: Standard LQ Solution for Example 3.1.


Comparing Tables 3.1 and 3.3, note that for $k \le N-2$ the gains
of the Proposition 3.1 JLQ controller are modified (relative to the
LQ controller) to reflect future form changes and costs. The JLQ
controller has higher r=1 gains to compensate for the possibility
that the system might shift to the more expensive form r=2. Similarly,
the r=2 gains are lower in the JLQ controller.

Example 3.2: Here is an example where the optimal closed-loop
systems in different forms are not all stable, although the expected
value of x is driven to zero. Let

\[ x_{k+1} = x_k + u_k \qquad \text{if } r_k = 1 \]

\[ x_{k+1} = 2 x_k + u_k \qquad \text{if } r_k = 2 \]

\[ p_{11} = p_{21} = .9 , \qquad p_{12} = p_{22} = .1 \]

\[ Q(j) = 1 \]

and

\[ R(1) = 1 , \qquad R(2) = 1000 . \]

Thus there is a high penalty on control in form 2.


This system is nine times more likely to be in r=1 than in r=2
at any time. We might expect that the optimal control strategy may
tolerate instability while in the expensive-to-control form r=2, since
the system is likely to return soon to the form r=1 where control costs
are much less. Computation of (3.8)-(3.11) for four iterations
demonstrates this, as shown in Tables 3.4 and 3.5.


            K_k(1)       K_k(2)        L_k(1)       L_k(2)
   k=N-1    .5           3.996004      .5           1.998x10^-3
   k=N-2    .6490736     7.384818      .6490736     3.67203x10^-3
   k=N-3    .6990352     9.2692147     .6990352     4.60253x10^-3
   k=N-4    .7187893     10.198343     .7187893     5.06036x10^-3

Table 3.4: Optimal Gains and Costs of Example 3.2.

            a_1 - b_1 L_k(1)    a_2 - b_2 L_k(2)
   k=N-1    .5                  1.998002
   k=N-2    .3590264            1.996328
   k=N-3    .3009648            1.9953975
   k=N-4    .2812107            1.9949396

Table 3.5: Closed-Loop Optimal Dynamics of Example 3.2.

These quantities are converging as $(N-k) \to \infty$. Note that the closed-loop
system is unstable while in r=2.

Direct calculation of the expected value of $x_k$, given $x_0$ and $r_0$,
shows that $E\{x_k\}$ decreases as k increases. This is shown in Table 3.6.


            if r_0 = 1     if r_0 = 2
   x_0      1.0            1.0
   x_1      .28121         1.99494
   E{x_2}   .13228         .93844
   E{x_3}   .06915         .49057
   E{x_4}   .04493         .31877

Table 3.6: E{x_k} for Example 3.2


In four time steps, $E\{x\}$ is reduced by over 95% in form 1 and 68% in
form 2. Note that if the system starts in the expensive-to-control
form r=2, x is allowed to increase for one time step (until control
while in r=1 is likely to reduce it).
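The "direct calculation" of $E\{x_k\}$ can be sketched by propagating, for each form j, the quantity $E\{x_k \cdot 1[r_k = j]\}$ forward in time. The function name `expected_state` is hypothetical; the closed-loop values are copied from Table 3.5 (with N=4, so time k=0 uses the k=N-4 row) and the transition probabilities are those of Example 3.2.

```python
def expected_state(closed_loop, p, x0, r0):
    """Return [E{x_0}, E{x_1}, ...] for a scalar jump linear closed loop.

    closed_loop[k][j] is the closed-loop multiplier a_j - b_j L_k(j) that
    applies at time k in form j; p[i][j] is Pr{next form j | form i};
    r0 is the 0-based index of the initial form.
    """
    m = len(p)
    # mass[j] = E{x_k * 1[r_k = j]}
    mass = [x0 if j == r0 else 0.0 for j in range(m)]
    means = [sum(mass)]
    for multipliers in closed_loop:
        # apply the dynamics of the current form, then let the form jump
        moved = [mass[i] * multipliers[i] for i in range(m)]
        mass = [sum(moved[i] * p[i][j] for i in range(m)) for j in range(m)]
        means.append(sum(mass))
    return means


# Rows of Table 3.5, reordered so that index 0 is time k = 0 (= N-4).
TABLE_3_5 = [
    [0.2812107, 1.9949396],
    [0.3009648, 1.9953975],
    [0.3590264, 1.996328],
    [0.5, 1.998002],
]
P_EXAMPLE_3_2 = [[0.9, 0.1], [0.9, 0.1]]

print(expected_state(TABLE_3_5, P_EXAMPLE_3_2, x0=1.0, r0=0))
print(expected_state(TABLE_3_5, P_EXAMPLE_3_2, x0=1.0, r0=1))
```

Tracking the per-form masses rather than a single mean is what lets the same routine handle chains whose rows differ; in Example 3.2 the two rows happen to be identical, so the form distribution after the first step no longer depends on the initial form.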

Example 3.3: This example illustrates how 'small' errors in the modelling
of transition probabilities near zero or one can cause large differences
in the JLQ optimal controller. Let

\[ x_{k+1} = x_k + u_k \qquad \text{if } r_k = 1 \]

\[ x_{k+1} = x_k \qquad \text{if } r_k = 2 \]

with $p_{12}$ small, where N=4 and

\[ Q_1 = 1 , \quad Q_2 = 0 , \quad R_1 = 100 , \quad R_2 = 0 , \quad K_T(1) = 0 , \quad K_T(2) = 10^8 . \]

The system starts in form $r_0 = 1$. If a failure occurs at time k (that
is, $r_k = 2$) then a cost

\[ V_k(x_k, r_k = 2) = x_k^2 K_k(2) \]

is charged. But since no control is possible in the failed form (i.e.,
b(2)=0),

\[ V_k(x_k, r_k = 2) = x_k^2 K_T(2) . \]

We will consider three values of the failure probability $p_{12}$ here:

   Case A:  No failures possible ($p_{12} = 0$)
   Case B:  $p_{12} = .001$
   Case C:  $p_{12} = .002$

in order to examine the effects of small errors in the modelling of $p_{12}$.

If there is no chance of failure (Case A) then the optimal LQ
control slowly drives x towards zero (less than 4% reduction in 4 time
intervals). The optimal costs, control gains and closed loop dynamics
(for r=1) in this case are given by Table 3.7.


            K_k(1)        L_k(1)        a_1 - b_1 L_k(1)
   k=N-1    .99099        .00990099     .99099
   k=N-2    1.951267      .0195126      .9804874
   k=N-3    2.8666641     .028666       .971334
   k=N-4    3.7227191     .0372271      .9627729

Table 3.7: Example 3.3 JLQ controller in form r=1, under Case A ($p_{12} = 0$).
If there is a small nonzero failure probability (Case B $p_{12} = .001$,
Case C $p_{12} = .002$) then the optimal JLQ controller drives x to zero
almost completely in the first time step, as shown in Table 3.8. Thus
a small difference in the value of $p_{12}$ here makes a large difference in
the optimal controller only if the difference changes the form transition
structure of the system ((Case A vs. Case B) but not (Case B vs. Case C)).


            K_k(1)        L_k(1)        a_1 - b_1 L_k(1)
   k=N-1    99.990001     .9999         9.999x10^-5
   k=N-2    99.990002     .9999         9.99801x10^-5
   k=N-3    99.990002     .9999         9.99801x10^-5
   k=N-4    99.990002     .9999         9.99801x10^-5

(a) Case B: $p_{12} = .001$, b(2)=0

            K_k(1)        L_k(1)        a_1 - b_1 L_k(1)
   k=N-1    99.995        .99995        4.999975x10^-5
   k=N-2    99.995        .99995        4.99951x10^-5
   k=N-3    99.995        .99995        4.99951x10^-5
   k=N-4    99.995        .99995        4.99951x10^-5

(b) Case C: $p_{12} = .002$, b(2)=0

Table 3.8: Example 3.3 JLQ controller in form r=1 with b(2)=0 and
(a) $p_{12} = .001$ (b) $p_{12} = .002$.

Now consider what happens when the wrong controller is used in the
above cases, where $x_0 = 1$ and $r_0 = 1$.

If the true value is $p_{12} = 0$ but the $p_{12} = .001$ controller is used, then

   $u_0 = -.999002$
   $x_1 = 9.98 \times 10^{-4}$

and the achieved cost-to-go is around 99.801, or about twenty-six
times greater than the cost with the correct ($p_{12} = 0$) controller.

If the true $p_{12} = .001$ but the $p_{12} = 0$ controller is mistakenly used,
then

   $x_1 = .9627729$
   $E\{x_2\} = .9352016$
   $E\{x_3\} = .9188873$
   $E\{x_4\} = .9087535$

and the expected cost-to-go is 346290.67, which is around 3400 times
greater than what the correct controller obtains.

In general, sensitivity to small parameters can be expected
if the closed-loop costs are very different in the different
forms and if a small change in the form transition probabilities alters
the form chain structure (probabilities very near zero and one). Changes
in the controllability structure are reasonable in models of failure-prone
systems. Different cost structures for failed and unfailed forms are
also appropriate; for example, a system may use expensive back-up equipment
when failures occur. The example system above is an extreme case which
illustrates some of the issues that arise in deriving general
theoretical results concerning JLQ systems.


3.5  The  Steady-State  Problem 

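In this section we consider the JLQ Markovian form control problem
(3.1)-(3.5) when all parameters are time-invariant and the time horizon
$(N - k_0)$ becomes infinite.

We wish to minimize

\[ \lim_{(N-k_0) \to \infty} E\Big\{ \sum_{k=k_0}^{N-1} \big[ u_k' R(r_k) u_k + x_{k+1}' Q(r_{k+1}) x_{k+1} \big] + x_N' K_T(r_N) x_N \,\Big|\, x_0, r_0 \Big\} \tag{3.12} \]

subject to

\[ x_{k+1} = A(r_k)\, x_k + B(r_k)\, u_k \tag{3.13} \]

\[ \Pr\{ r_{k+1} = j \mid r_k = i \} = p(i,j) \tag{3.14} \]

\[ x(k_0) = x_0 , \qquad r(k_0) = r_0 . \]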

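From Proposition 3.1 we have that the optimal control laws are

\[ u_{k-1}(x_{k-1}, r_{k-1} = j) = -L_{k-1}(j)\, x_{k-1} \]

with optimal expected costs-to-go

\[ V_k(x_k, r_k = j) = x_k' K_k(j)\, x_k \]

where for each $j \in \underline{M}$, and k = N-1, N-2, ...

\[ L_k(j) = \Big[ R(j) + B'(j) \Big( \sum_{i=1}^{M} p(j,i) \big[ Q(i) + K_{k+1}(i) \big] \Big) B(j) \Big]^{-1} B'(j) \Big( \sum_{i=1}^{M} p(j,i) \big[ Q(i) + K_{k+1}(i) \big] \Big) A(j) \tag{3.15} \]

and

\[ K_k(j) = A'(j) \Big( \sum_{i=1}^{M} p(j,i) \big[ Q(i) + K_{k+1}(i) \big] \Big) \big[ A(j) - B(j) L_k(j) \big] \tag{3.16} \]

with

\[ K_N(j) = K_T(j) . \]

The optimal closed-loop dynamics in each form $j \in \underline{M}$ are thus

\[ x_{k+1} = D_k(j)\, x_k \]

where

\[ D_k(j) = A(j) - B(j) L_k(j) , \tag{3.17} \]

with $L_k(j)$ given by (3.15).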

Before stating the main result of this section, we recall the
following terminology pertaining to finite-state Markov chains:

.  A state is transient if a return to it is not guaranteed.

.  A state i is recurrent if an eventual return to i is
   guaranteed. If the state set is finite, the mean time
   until return is finite.

.  State i is accessible from state j if it is possible to
   begin in j and arrive in i in some finite number of
   steps.

.  States i and j are said to communicate if each is accessible
   from the other.

.  A communicating class is closed if there are no possible
   transitions from inside the class to any state outside it.

.  A Markov chain state set can be divided into disjoint sets
   $T, C_1, \ldots, C_s$, where all of the states in T are transient,
   and each $C_i$ is a closed communicating class (of recurrent
   states).

Define the cover of a form $j \in \underline{M}$ to be the set of all forms
accessible from j in one time step. That is,

\[ \mathcal{C}_j = \{ i \in \underline{M} : p(j,i) \ne 0 \} . \]
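The decomposition of the form set into transient forms and closed communicating classes, and the cover of a form, can be computed directly from the transition matrix. The sketch below uses hypothetical names (`classify_forms`, `cover`) and a hypothetical seven-form chain whose numerical probabilities are made up for illustration; it relies on the standard fact that, in a finite chain, a communicating class is recurrent exactly when it is closed. Forms are indexed from 0 in the code.

```python
def classify_forms(p):
    """Split the forms of a finite Markov chain into transient forms and
    closed communicating classes.  p[i][j] = Pr{next form j | form i}."""
    m = len(p)
    # reach[i][j]: j is accessible from i in zero or more steps
    reach = [[i == j or p[i][j] > 0 for j in range(m)] for i in range(m)]
    for k in range(m):                      # Warshall transitive closure
        for i in range(m):
            for j in range(m):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    classes, seen = [], set()
    for i in range(m):
        if i in seen:
            continue
        cls = [j for j in range(m) if reach[i][j] and reach[j][i]]
        seen.update(cls)
        classes.append(cls)
    # a class is closed if no transition leaves it
    closed = [c for c in classes
              if all(p[i][j] == 0 for i in c for j in range(m) if j not in c)]
    recurrent = {i for c in closed for i in c}
    transient = [i for i in range(m) if i not in recurrent]
    return transient, closed


def cover(p, j):
    """Forms accessible from form j in one time step."""
    return [i for i in range(len(p)) if p[j][i] != 0]


# Hypothetical 7-form chain (index = form number - 1).
P7 = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # form 1 -> 2
    [0.5, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],   # form 2 -> 1 or 3
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],   # form 3 -> 4
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],   # form 4 -> 3
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0],   # form 5 -> 5 or 6
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],   # form 6 absorbing
    [0.0, 0.3, 0.0, 0.0, 0.0, 0.4, 0.3],   # form 7 -> 2, 6 or 7
]

transient_forms, closed_classes = classify_forms(P7)
print("transient:", transient_forms)
print("closed classes:", closed_classes)
print("cover of form 7:", cover(P7, 6))
```

An absorbing form shows up as a closed class with a single member, so the three cases treated separately in the conditions that follow (absorbing forms, larger closed classes, transient forms) can all be read off the two returned lists.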


The  main  result  of  this  section  is  the  following: 


Proposition 3.2

Consider the time-invariant Markovian JLQ problem (3.12)-(3.14).
Suppose that there exist feedback control laws

\[ u_k = -F_i x_k \qquad \text{for each } i \in \underline{M} \]

such that the following conditions hold:

(1)  For each absorbing form i (i.e., $p_{ii} = 1$) the (deterministic)
     cost-to-go from $(x_k = x, r_k = i)$ at time k remains finite
     (for any finite x) as $(N-k) \to \infty$. This is true if and only if

\[ \sum_{t=0}^{\infty} \big[ (A_i - B_i F_i)' \big]^t (Q_i + F_i' R_i F_i) (A_i - B_i F_i)^t < \infty \tag{3.18} \]

     (each element finite); see [36], p. 53.

(2)  For each closed communicating class $C_j$ (having two
     or more members) the expected cost-to-go from
     $(x_k = x, r_k = i \in C_j)$ at time k remains finite (for any
     finite x and each $i \in C_j$) as $(N-k) \to \infty$. This will be
     true if and only if for each such class there
     exists a set of finite positive-definite nxn matrices
     $\{Z_i : i \in C_j\}$ satisfying (3.19):

\[ Z_i = (1 - p_{ii}) \sum_{t=1}^{\infty} p_{ii}^{\,t-1} \big[ (A_i - B_i F_i)' \big]^t \Big[ Q_i + F_i' R_i F_i + \sum_{l \in C_j,\ l \ne i} \frac{p_{il}}{1 - p_{ii}}\, Z_l \Big] (A_i - B_i F_i)^t \tag{3.19} \]

     for all $i \in C_j$.

(3)  For each transient form $i \in T \subset \underline{M}$, the expected cost-
     to-go until the form process leaves T (that is, until
     a closed communicating class is entered) is finite.
     This is true if and only if there exist finite positive-
     definite nxn matrices $\{G_i : i \in T\}$ satisfying (3.20):

\[ G_i = (1 - p_{ii}) \sum_{t=1}^{\infty} p_{ii}^{\,t-1} \big[ (A_i - B_i F_i)' \big]^t \Big[ Q_i + F_i' R_i F_i + \sum_{l \in T,\ l \ne i} \frac{p_{il}}{1 - p_{ii}}\, G_l \Big] (A_i - B_i F_i)^t \tag{3.20} \]

     for all $i \in T$.

The existence of feedback laws satisfying these conditions is
necessary and sufficient for the solution of the set of coupled
matrix difference equations (3.15)-(3.16) to converge to a unique
constant steady-state set

\[ \{ K(j) \ge 0 ;\ j \in \underline{M} \} \tag{3.21} \]

as $(N - k_0) \to \infty$, given by the M coupled equations

\[ K(j) = A_j' \Big( \sum_{i=1}^{M} p_{ji} [Q_i + K(i)] \Big) \Big\{ I - B_j \Big[ R_j + B_j' \Big( \sum_{i=1}^{M} p_{ji} [Q_i + K(i)] \Big) B_j \Big]^{-1} B_j' \Big( \sum_{i=1}^{M} p_{ji} [Q_i + K(i)] \Big) \Big\} A_j \tag{3.22} \]

for $j \in \underline{M}$. The steady-state optimal control laws

\[ u_k = -L_j x_k \]

have time-invariant gains $\{L_j : j \in \underline{M}\}$ given by

\[ L_j = \Big[ R_j + B_j' \Big( \sum_{i=1}^{M} p_{ji} [Q_i + K(i)] \Big) B_j \Big]^{-1} B_j' \Big( \sum_{i=1}^{M} p_{ji} [Q_i + K(i)] \Big) A_j \tag{3.23} \]

and minimize (3.12)-(3.14) with

\[ V_{k_0}(x_0, r_0) = x_0' K(r_0)\, x_0 < \infty \tag{3.24} \]

for $x_0' x_0 < \infty$.

When the steady-state optimal control laws (3.23)-(3.24) exist, they
stabilize the system in the sense that

\[ E\{ x_k' x_k \} \to 0 \]

as $(k - k_0) \to \infty$, and $K(j) > 0$ for each $j \in \underline{M}$, if

(4)  for at least one form i in each closed communicating
     subset of $\underline{M}$, the null spaces

h(qY2)  nnd^) 


{0} 


(3.25) 

□ 


The conditions (2)-(3) take into account

.  the probability of being in forms that have unstable
   closed loop dynamics

.  the relative expansion and contraction effects of
   unstable and stable form dynamics, and how the
   eigenvectors of accessible forms are "aligned."

That is, it is not necessary or sufficient for all
forms to be stable, since the interaction of different
expected form dynamics determines the
behavior of $E\{x_k' x_k\}$.

This will be illustrated in the examples of this section. The conditions
in Proposition 3.2 differ from those of the usual discrete-time linear
quadratic regulator problem¹ in that:

.  necessary and sufficient conditions (1)-(3) replace the
   sufficient condition that the (single form) system is
   stabilizable

.  condition (4) replaces the assumption that the (single
   form) pair $(A, Q^{1/2})$ is detectable.

¹See, for example, [38], p. 497.

Unfortunately conditions (1)-(4) are not easily verified. There is
no evident algebraic test for (3.18)-(3.21) like the controllability
and observability tests in the LQ problem. The use of the conditions
in Proposition 3.2 will be demonstrated in examples later in this
section.
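The remark that stability of every form is neither necessary nor sufficient can be checked numerically in the scalar case by propagating the second moment $E\{x_k^2\}$ under fixed closed-loop multipliers. This is only an illustrative sketch: the function name `second_moments` is hypothetical, the multipliers are assumed time-invariant, and the two data sets below (one borrowing the k=N-4 row of Table 3.5, one a made-up "flip-flop" chain) are chosen for illustration.

```python
def second_moments(d, p, r0, steps, x0=1.0):
    """E{x_k^2}, k = 0..steps, for the scalar closed loop x_{k+1} = d[r_k] x_k.

    d[j] is the closed-loop multiplier in form j, p[i][j] the form
    transition probability, r0 the 0-based initial form.
    """
    m = len(d)
    # s[j] = E{x_k^2 * 1[r_k = j]}
    s = [x0 * x0 if j == r0 else 0.0 for j in range(m)]
    out = [sum(s)]
    for _ in range(steps):
        grown = [s[i] * d[i] ** 2 for i in range(m)]
        s = [sum(grown[i] * p[i][j] for i in range(m)) for j in range(m)]
        out.append(sum(s))
    return out


# Form 2 is unstable (|d| > 1) but is left quickly: the second moment decays.
decaying = second_moments([0.2812107, 1.9949396],
                          [[0.9, 0.1], [0.9, 0.1]], r0=1, steps=50)

# Flip-flop chain: form 1 is stable, yet d1^2 * d2^2 = 2.25 > 1, so the
# second moment grows along every sample path.
growing = second_moments([0.5, 3.0], [[0.0, 1.0], [1.0, 0.0]], r0=0, steps=20)

print(decaying[1], decaying[50])
print(growing[20])
```

In the first case the second moment rises in the first step (the system starts in the unstable form) and then contracts by a fixed factor per step; in the second, one stable form is not enough because the product of the two squared multipliers exceeds one.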

The proof of Proposition 3.2 has the same basic outline as in
the LQ problem:

(i)   First show that conditions (1)-(3) guarantee that
      with zero terminal costs $\{K_N(j) = 0;\ j \in \underline{M}\}$, the
      sequence of positive semidefinite symmetric matrices
      $\{K_{k_0}(j)\}$ (for each $j \in \underline{M}$) in (3.16) is increasing
      and bounded above as $(N - k_0)$ increases and hence the
      $K_{k_0}(j)$ converge element by element to bounded matrices

\[ \lim_{(N-k_0) \to \infty} K_{k_0}(j) = K(j) . \]

      Then (3.15)-(3.16) yield the steady-state values
      (3.22)-(3.23) and the costs

\[ x_0' K(j)\, x_0 , \qquad r_0 = j \in \underline{M} \]

      are finite for finite $x_0$.

(ii)  Condition (4) is then shown to guarantee that $E\{x_k' x_k\}$
      goes to zero as $(k - k_0)$ becomes large, and that $K(j) > 0$
      for each $j \in \underline{M}$.

(iii) Next it is shown that these results hold for
      arbitrary finite symmetric terminal cost matrices
      $\{K_T(j) \ge 0;\ j \in \underline{M}\}$.

(iv)  Finally it is easily shown (by contradiction) that
      the $\{K(j),\ j \in \underline{M}\}$ are the unique positive definite
      solutions of (3.18).

Once (i)-(ii) are proved then (iii)-(iv) are easily established.
Step (ii) is proved in Appendix B.2. Note that:


Corollary 3.3: The null-space requirement in condition (4) of
Proposition 3.2 is satisfied if, for at least one form i in each closed
communicating subset of $\underline{M}$,

\[ Q_i > 0 . \]


The difficult part of proving Proposition 3.2 is establishing that
conditions (1)-(3) have the desired effect. Equations (3.18)-(3.20)
follow by a direct application of dynamic programming. The cost-to-go
from $(x_k, r_k = i)$ if i is an absorbing form is

\[ x_k' \Big\{ \sum_{t=0}^{\infty} \big[ (A_i - B_i F_i)' \big]^t (Q_i + F_i' R_i F_i) (A_i - B_i F_i)^t \Big\} x_k \]

(where control law gain $-F_i$ is used), hence (3.18). For a closed
communicating class $C_j$, the expected costs-to-go from
$(x_k, r_k = i)$ for each

For the absorbing form r=6, (3.18) yields

\[ \sum_{t=0}^{\infty} a^t(6)\, Q(6)\, a^t(6) < \infty . \]


Hence 


C 


Thus we have condition

   (i)  $a^2(6) < 1$.

For the closed communicating class {3,4}, (3.19) gives coupled
equations

\[ Z_3 = a(3) [Q_3 + Z_4]\, a(3) \]

\[ Z_4 = a(4) [Q_4 + Z_3]\, a(4) . \]

Plugging in for $Z_4$ in the first equation yields

\[ Z_3 = \frac{a^2(3)}{1 - a^2(3) a^2(4)} \big[ Q(3) + a^2(4) Q(4) \big] \]

\[ Z_4 = \frac{a^2(4)}{1 - a^2(3) a^2(4)} \big[ Q(4) + a^2(3) Q(3) \big] . \]

Thus for $Z_3$, $Z_4$ positive we have condition

   (ii)  $a^2(3) a^2(4) < 1$.

For the transient forms {1,2,5,7}, (3.20) yields

\[ G_1 = a(1) [Q(1) + G_2]\, a(1) \]

\[ G_2 = a(2) [Q(2) + p_{21} G_1]\, a(2) \]

\[ G_7 = (1 - p_{77}) \sum_{t=1}^{\infty} p_{77}^{\,t-1} a^{2t}(7) \Big[ Q(7) + \frac{p_{72}}{1 - p_{77}}\, G_2 \Big] \]

\[ G_5 = (1 - p_{55}) \sum_{t=1}^{\infty} p_{55}^{\,t-1} a^{2t}(5)\, Q(5) . \]

Now for $0 < G_5 < \infty$ we have the condition

   (iii)  $p_{55}\, a^2(5) < 1$

with the resulting

\[ G_5 = \frac{Q(5)\, a^2(5)\, (1 - p_{55})}{1 - p_{55}\, a^2(5)} . \]

We find from the $G_1$ and $G_2$ equations above that

\[ G_1 = \frac{a^2(1) \big[ Q(1) + a^2(2) Q(2) \big]}{1 - a^2(1) a^2(2) p_{21}} \]

\[ G_2 = \frac{a^2(2) \big[ Q(2) + a^2(1) p_{21} Q(1) \big]}{1 - a^2(1) a^2(2) p_{21}} \]

so for $0 < G_1, G_2 < \infty$ we have condition

   (iv)  $a^2(1) a^2(2) p_{21} < 1$.

Finally we find

\[ G_7 = a^2(7) (1 - p_{77}) \Big[ Q(7) + \frac{p_{72} G_2}{1 - p_{77}} \Big] \sum_{t=1}^{\infty} \big( a^2(7) p_{77} \big)^{t-1} \]

so for $0 < G_7 < \infty$ we have condition

   (v)  $a^2(7) p_{77} < 1$

with

\[ G_7 = \frac{a^2(7) (1 - p_{77})}{1 - a^2(7) p_{77}} \Big[ Q(7) + \frac{p_{72}\, a^2(2) \big[ Q(2) + a^2(1) p_{21} Q(1) \big]}{(1 - p_{77}) \big( 1 - a^2(1) a^2(2) p_{21} \big)} \Big] . \]

Thus (i)-(v) are the necessary and sufficient conditions of
Proposition 3.2. For this example we see that

.  The absorbing form r=6 must have stable system
   dynamics (i)

.  one of the forms in the closed communicating class
   {3,4} can be unstable as long as the other form's
   dynamics make up for the instability (ii)

.  transient forms r=5,7 can have unstable dynamics as
   long as the probability of staying in them for any
   length of time is low enough (iii), (v)
Both forms have stable systems (eigenvalues 1/2, 1/2) and hence are
stabilizable. However

\[ x_{k+2} = \begin{pmatrix} 100.25 & 5 \\ 5 & .25 \end{pmatrix} x_k \qquad \text{if } r_k = 1 \]

\[ x_{k+2} = \begin{pmatrix} .25 & 5 \\ 5 & 100.25 \end{pmatrix} x_k \qquad \text{if } r_k = 2 \]

which is clearly unstable. Thus $x_k$ and the expected cost (3.12)
become infinite as $(N - k_0)$ goes to infinity.
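The effect can be reproduced with a small calculation. The two matrices below are hypothetical: they are one pair with both eigenvalues equal to 1/2 whose product matches the two-step matrix quoted for $r_k = 1$, and are not necessarily the matrices of the original example. The helper names are likewise illustrative.

```python
import math


def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_2x2(mat):
    """Largest eigenvalue modulus of a real 2x2 matrix."""
    tr = mat[0][0] + mat[1][1]
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        root = math.sqrt(disc)
        return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))
    return math.sqrt(det)   # complex pair: modulus^2 equals the determinant


A1 = [[0.5, 0.0], [10.0, 0.5]]    # assumed form-1 dynamics
A2 = [[0.5, 10.0], [0.0, 0.5]]    # assumed form-2 dynamics

two_step = matmul2(A2, A1)        # form 1 at time k, form 2 at time k+1
print(spectral_radius_2x2(A1), spectral_radius_2x2(A2))
print(two_step, spectral_radius_2x2(two_step))
```

Each form alone contracts asymptotically, but the alternating product has an eigenvalue of about 100, because the large off-diagonal terms of the two forms reinforce each other.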

In fact, controllability in each form is not sufficient, as
demonstrated below.

Example 3.6: Controllability not sufficient for finite cost
Let M=2 where

Thus in each form (r=1,2) the system is controllable, and the closed-
loop systems have dynamics

\[ x_{k+1} = D_j x_k , \qquad r_k = j \]

where $f_1, \ldots, f_4$ are determined by the feedback laws chosen.


Now suppose that we have a "flip-flop" example:

\[ p_{11} = p_{22} = 0 , \qquad p_{21} = p_{12} = 1 . \]

Then

\[ x_{2k} = (D_2 D_1)^k x_0 \qquad \text{if } r_0 = 1 \]

\[ x_{2k} = (D_1 D_2)^k x_0 \qquad \text{if } r_0 = 2 \]

where

/f  lf4 

2£  +f  f 

3  2  4 

D  D  =  1 

2  1  \ 

\  0 

4 

3D  =  I 
12 


) 


I 


1 

1 

I 


Both $D_1 D_2$ and $D_2 D_1$ have 4 as an eigenvalue. Thus $x_k$ grows without
bound for $x_0 \ne 0$ as k increases. Controllability in each form allows
us to place the eigenvalues of each form's closed loop dynamics matrix
($D_j$) as we choose, but we cannot place the eigenvectors. In this
example, there is no choice of feedback laws that can align the
eigenstructures of each of the closed loop systems so that the overall
dynamics are stable. The following example demonstrates that (for $n \ge 2$)
stabilizability of even one form is not necessary for the costs to be
bounded above.

Both forms are unstable, uncontrollable systems so neither is
stabilizable. We again take

*The closed-loop systems are stable if and only if the modulus of each
eigenvalue is less than one. See, for example, [38], p. 454.


and  plugging  this  into  the  second  equation: 


zuC2) 


zuu> 


2  +  IZU<2) 


-  +  -  Z  (2) 
2  4  12  1 


221(2)  Z22(2) 


t4Z21(2> 


2WZ22(2>' 


This  yields  four  equations  in  four  unknowns.  Solving,  we  find 


Zll(2)  Z12(2) 


2/3  2/3' 


Z21(2>  Z22(2) 


2/3 


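and thus

\[ \begin{pmatrix} Z_{11}(1) & Z_{12}(1) \\ Z_{21}(1) & Z_{22}(1) \end{pmatrix} = \begin{pmatrix} 5/3 & -4/3 \\ -4/3 & 5/3 \end{pmatrix} \]

which are both positive definite. Thus $Z_1$ and $Z_2$ satisfy (2) of
Proposition 3.2.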

We can obtain sufficient conditions
that replace the necessary and sufficient conditions (1)-(3) in
Proposition 3.2, and are somewhat easier to compute, in terms of
the singular values of certain matrices. For any matrix A,

\[ \|A\| = \big[ \text{max eigenvalue of } A'A \big]^{1/2} = \text{max singular value of } A . \tag{3.26} \]

Note: In the above, $\|A\|$ is the spectral norm of A, defined as

\[ \|A\| = \max_{\|u\| = 1} \{ \|Au\| \} \tag{3.27} \]

over all vectors u of unit length, where $\|\cdot\|$ on the right in
(3.27) designates the ordinary euclidean norm of a vector.
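The two characterizations of the spectral norm can be compared numerically for a 2x2 matrix. The sketch below uses hypothetical helper names and a made-up test matrix; it computes the norm once from the largest eigenvalue of $A'A$ and once by scanning unit vectors.

```python
import math


def spectral_norm_2x2(a):
    """Square root of the largest eigenvalue of A'A for a real 2x2 matrix."""
    g11 = a[0][0] ** 2 + a[1][0] ** 2
    g12 = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    g22 = a[0][1] ** 2 + a[1][1] ** 2
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    # A'A is symmetric, so the discriminant is nonnegative up to rounding
    lam_max = (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0
    return math.sqrt(lam_max)


def norm_by_scan(a, samples=20000):
    """max ||A u|| over sampled unit vectors u = (cos t, sin t)."""
    best = 0.0
    for n in range(samples):
        t = math.pi * n / samples
        u0, u1 = math.cos(t), math.sin(t)
        v0 = a[0][0] * u0 + a[0][1] * u1
        v1 = a[1][0] * u0 + a[1][1] * u1
        best = max(best, math.hypot(v0, v1))
    return best


A = [[0.5, 10.0], [0.0, 0.5]]     # both eigenvalues are 1/2
print(spectral_norm_2x2(A), norm_by_scan(A))
```

The example matrix has both eigenvalues equal to 1/2 yet a spectral norm of about 10, so a norm-based test is much more demanding than an eigenvalue-based one.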


Corollary 3.4: Consider the problem of Proposition 3.2. Sufficient
conditions for the existence of the steady-state control law (and
finite expected costs-to-go), replacing (1)-(3), are:

there exist feedback control laws

\[ u_k = -F_i x_k \qquad i \in \underline{M} \]

such that

(1)  for each absorbing form i ($p_{ii} = 1$),

\[ \sum_{t=0}^{\infty} \| (A_i - B_i F_i)^t \|^2 < \infty \tag{3.28} \]

(2)  for each recurrent, nonabsorbing form i,

\[ (1 - p_{ii}) \sum_{t=1}^{\infty} p_{ii}^{\,t-1} \| (A_i - B_i F_i)^t \|^2 < c < 1 \tag{3.29} \]

(3)  for each transient form $i \in T$ that is accessible from a
     form $j \in T$ in its cover ($j \ne i$),

\[ (1 - p_{ii}) \sum_{t=1}^{\infty} p_{ii}^{\,t-1} \| (A_i - B_i F_i)^t \|^2 < c < 1 \tag{3.30} \]

     and for each transient form $i \in T$ that is not accessible from
     any form $j \in T$ in its cover (except itself):

\[ (1 - p_{ii}) \sum_{t=1}^{\infty} p_{ii}^{\,t-1} \| (A_i - B_i F_i)^t \|^2 < \infty . \tag{3.31} \]

The proof of this Corollary is given in Appendix B.3. A similar result
for continuous-time systems is obtained by Wonham¹ [76], except that
stabilizability and observability of each form is required, and a
condition of the form (3.29)-(3.30) is required for all nonabsorbing forms.

Condition (3) is motivated as follows. The cost incurred while in
a particular transient form is finite with probability one since,
eventually, the form process leaves the transient class T and enters a
closed communicating class. If a particular transient form $i \in T$
can be repeatedly re-entered, however, the expected cost incurred while
in i may be infinite; (3.30) excludes such cases. Note that the
sufficient conditions of Corollary 3.4 are violated in Example 3.7 (in both
forms). This demonstrates that they are restrictive, in that they ignore
the relative "directions" of x growth in the different forms
(i.e. the eigenvector structure). We consider next a sufficient
condition that is easier to verify than (1)-(3) of Corollary 3.4,
but more restrictive.

¹Theorem 6.1, p. 195 of [76].

Corollary 3.5: Sufficient conditions (1)-(3) in Proposition 3.2
can be replaced by the following:

For each form $i \in \underline{M}$, there exist feedback control laws

\[ u_k = -F_i x_k \]

such that

\[ \| A_i - B_i F_i \| < c < 1 . \tag{3.32} \]

Proof: If this condition holds, then with these $F_i$ we have (with
$x_0$ finite)

\[ E\Big\{ \sum_{k=0}^{\infty} x_k' Q(r_k) x_k + u_k' R(r_k) u_k \Big\} \le \|x_0\|^2 \Big( \max_j \| Q_j + F_j' R_j F_j \| \Big) \max_i \sum_{k=0}^{\infty} \| A_i - B_i F_i \|^{2k} \le (\text{constant}) \sum_{k=0}^{\infty} c^{2k} < \infty , \]

since c < 1.

Note that if (3.32) holds then conditions (1)-(3) do. Note also
that we are guaranteed that $\|x_k\| \to 0$ with probability one, if (3.32)
holds only for recurrent forms. However this is not enough to have
finite expected cost, as demonstrated in the following examples.


min  | iA1*B1p1l ! 


c  :)l 


a  >  1 


min  !  A  -■  F  | |-  0 

P2 


and  for  r  *1  and  |  jxJ  I  finita 


E  |  Jo  xk8<t>‘”‘'‘  * 


J  k  2k  i i  ,.2 

Z  p  a  I l*J I 


l*J!2  l  <a2p)k 


If $a^2 p < 1$ then the expected cost is

\[ \frac{\|x_0\|^2}{1 - a^2 p} \]

but if $a^2 p \ge 1$ then the expected cost-to-go is infinite. This
demonstrates that (3.32) holding only for nontransient forms is not
sufficient.


^ 


-  _”_*.  •.*  *  ■  * — " t  s'  \k  — _7 — r ^ — T" — 7*- 


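Example 3.9: Let

$$x_{k+1} = \begin{cases} A_1 x_k & \text{if } r_k = 1, 3 \\ A_2 x_k & \text{if } r_k = 2 \end{cases} \qquad (a \neq 0)$$

where, as used below, $A_2 = aI$ and $A_1 = A_3$ satisfies $A_1^t = 0$ for $t \ge 2$.

If the system is in form 1 for three successive times
($r_k = r_{k+1} = r_{k+2} = 1$), then $x_{k+2} = (0\ 0)'$ for any $x_k$. The same is
true for three successive times in the absorbing form r=3.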

In form r=2, the expected cost incurred until the system leaves
(at time T) given that the state at time k is $(x_k, r_k=2)$ is

$$x_k' \Big[ (1-p_{22}) \sum_{t=1}^{\infty} p_{22}^{\,t-1} a^{2t} Q_2 \Big] x_k .$$

For this cost to be finite we must have

$$(1-p_{22}) \sum_{t=1}^{\infty} p_{22}^{\,t-1} a^{2t} Q_2 = a^2 Q_2 (1-p_{22}) \sum_{t=0}^{\infty} (p_{22} a^2)^t < \infty$$

which is true (for $Q_2$ finite) if and only if

$$a^2 p_{22} < 1 .$$

Thus we would expect that the optimal expected costs-to-go in Proposition
3.2 will be finite if and only if

$$a^2 p_{22} < 1 .$$


Let us verify that the necessary and sufficient conditions of Proposition
3.2 say this.

From (3.18), for absorbing form r=3

$$\sum_{t=0}^{\infty} (A_3')^t Q_3 A_3^t < \infty$$

but

$$A_3^t = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{for } t \ge 2,$$

so this condition is met.


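For transient forms {1,2} we must have $0 < G_1, G_2 < \infty$

where

$$G_1 = (1-p_{11}) \sum_{t=1}^{\infty} p_{11}^{\,t-1} (A_1')^t \Big[ Q_1 + \frac{p_{12}}{1-p_{11}} G_2 \Big] A_1^t$$

$$G_2 = (1-p_{22}) \sum_{t=1}^{\infty} p_{22}^{\,t-1} (A_2')^t Q_2 A_2^t$$

Now

$$(A_2)^t = \begin{pmatrix} a^t & 0 \\ 0 & a^t \end{pmatrix}$$

thus

$$G_2 = Q_2 (1-p_{22}) \sum_{t=1}^{\infty} p_{22}^{\,t-1} a^{2t}$$

hence we have condition

(i) $a^2 p_{22} < 1$

and $G_2 = Q_2 a^2 (1-p_{22})/(1-p_{22}a^2)$. Finally since

$$(A_1)^t = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{for } t \ge 2,$$

we have

$$G_1 = (1-p_{11})\, A_1' \Big( Q_1 + \frac{p_{12}\, a^2 (1-p_{22})}{(1-p_{11})(1-p_{22}a^2)}\, Q_2 \Big) A_1$$

which is positive-definite since $Q_1, Q_2 > 0$. Thus the necessary
and sufficient conditions of Proposition 3.2 here reduce to (i),
as we deduced earlier. Note that the sufficient condition (3.32)
of Corollary 3.5 is never met for r=1, r=3:

$$\|A_1\| = \|A_3\| = \lambda_{\max}^{1/2} \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = 2 > 1,$$

and to meet (3.32) for r=2 requires

$$\|A_2\| = |a| < 1 \;\Rightarrow\; a^2 < 1 .$$

However the sufficient conditions for Corollary 3.4 are met because
forms {1,2} are 'non-re-enterable' transient forms satisfying
(3.31) (if $a^2 p_{22} < 1$ for r=2).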
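The finiteness condition (i) of Example 3.9 can be checked numerically. The following is a minimal sketch, assuming a scalar weight `q2` in place of the matrix $Q_2$ (an assumption made only to keep the illustration short); the function names are hypothetical. It compares a long partial sum of the geometric series with the closed form $Q_2 a^2(1-p_{22})/(1-p_{22}a^2)$.

```python
def g2_truncated(a, p22, q2=1.0, terms=10_000):
    """Partial sum of (1 - p22) * sum_{t>=1} p22**(t-1) * a**(2t) * q2.

    q2 is a scalar stand-in for Q_2 (assumption for this sketch).
    """
    total = 0.0
    term = a * a * q2              # the t = 1 term, without the (1 - p22) factor
    for _ in range(terms):
        total += term
        term *= p22 * a * a        # ratio between successive terms is p22 * a^2
    return (1.0 - p22) * total


def g2_closed_form(a, p22, q2=1.0):
    """Closed form q2 * a^2 * (1 - p22) / (1 - p22 * a^2), valid only if a^2 * p22 < 1."""
    if a * a * p22 >= 1.0:
        raise ValueError("series diverges: a^2 * p22 >= 1")
    return q2 * a * a * (1.0 - p22) / (1.0 - p22 * a * a)
```

With a = 1.5 and p22 = 0.4 (so that $a^2 p_{22} = 0.9$) the two functions agree; with a = 2 and p22 = 0.5 the closed form is rejected because the series diverges, even though the system leaves form 2 with probability one.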


3.6  Summary 

Let  us  consider  the  JLQ  controller  here  in  terms  of  the  fault- 
tolerance  criteria  of  section  1.2.  We  note  that  the  controller 
(3.7)-(3.8)  is  clearly  adaptable  (in  the  terminology  of  section  1.2), 
since  a  different  control  law  is  used  in  each  form.  That  is,  when 
a  failure  or  other  structural  change  occurs  it  is  instantaneously 
detected  (by  assumption)  and  this  information  is  used  to  reorganize 
the  controller. 

Passive hedging (the taking into account of possible future
form changes and associated costs) is accomplished via the

$$\sum_{i=1}^{M} p_k(j,i)\,[Q_k(i) + K_k(i)]$$

terms in (3.7)-(3.8). There is no active hedging possible in this
problem formulation because the form transition probabilities
cannot be controlled. With regard to the implementability
attribute of fault-tolerant controllers of section 1.2, the
precomputable nature of (3.7)-(3.8) should facilitate the use of
this controller if $M(N-k_0)$ (the number of gains that must be computed
and stored) is not too large. When the steady-state controller of
Proposition 3.2 exists, there is a set of M optimal steady-state gains that
can be used in place of the $M(N-k_0)$ gains; this certainly should
simplify implementation.
While in each form, the optimal JLQ controller of Proposition
3.1 is endowed with robustness properties derived from the linear
quadratic problem. However the JLQ controller may be extremely
sensitive to small errors in the modelling of form transition
probabilities, if the probability in question is close to zero or one
and if the controllability of the dynamics changes between forms,
as illustrated in example 3.3.

Proposition 3.2 provides necessary and sufficient conditions
for existence of the optimal steady-state JLQ controller. These
conditions are not easily tested for nonscalar-x problems, however,
since they require the simultaneous solution of coupled matrix
equations containing infinite sums. In Corollaries 3.4 and 3.5
sufficient conditions that are based upon singular values are presented
that are somewhat more testable for some problems. However the
derivation of easily calculable conditions for the JLQ steady state
problem (like the controllability and observability conditions of the
LQ problem) remains an open question.

4. EXTENSIONS OF THE X-INDEPENDENT JLQ PROBLEM


In this chapter we develop two extensions of the JLQ problem
formulation. Our purpose is to indicate how the ideas and results
of Chapter 3 can be applied to more general problems. We will consider
here only problems with form processes that are not explicitly
x-dependent. The more difficult cases of x and u dependent forms are
the subject of Parts III and IV of the thesis.

In section 4.1 we consider JLQ problems with additive input noise
and noisy observations of the x subsystem (but perfect observations
of the form). As in the LQ problem, a separation result holds. The
only complication is that the parameters of the estimator of $x_k$ (from
noisy observations) depend upon $r_k$, and thus cannot be computed
off-line.

In section 4.2 we widen the range of physical situations that can
be captured by the JLQ control problem by including in the problem
formulation jump costs and x-resets when the form changes.


4.1  The  JLQ  Problem  with  additive  input  and  x-observation  noise 


In this section we extend the JLQ problem of Chapter 3 to include
additive white input noise, and we assume that a linear function of
$x_k$ is observed at each time k in the presence of additive white noise.
Under the crucial assumption that the form process is perfectly observed
at each time k, the optimal control law is the same as in the noiseless
case but it acts upon an estimate of the x process. This estimate
is obtained by a Kalman filter, where the update parameters are
determined at each time by the observed form value.

We are considering the discrete-time jump linear system with
additive driving noise:


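$$x_{k+1} = A(r_k)x_k + B(r_k)u_k + \Xi(r_k)v_k \qquad (4.1)$$

$$\Pr\{r_{k+1}=j \mid r_k=i\} = p_{ij}, \qquad i,j \in M$$

where

$$x(k_0) = x_0, \qquad r(k_0) = r_0$$

$$p_{ij} \ge 0, \qquad \sum_{j=1}^{M} p_{ij} = 1 \qquad \forall\, i,j \in M$$

$$M = \{1,2,\ldots,M\}$$

$$x_k \in R^n, \quad u_k \in R^m, \qquad k = k_0, k_0+1, \ldots, N$$

At each time we observe $r_k$ perfectly and a linear function of
$x_k$ contaminated by white observation noise:

$$y_k = C(r_k)x_k + D(r_k)u_k + \Lambda(r_k)w_k . \qquad (4.2)$$

In (4.1)-(4.2), $v_k$ and $w_k$ are real-valued vectors. The input noise sequence $\{v_k\}$
and observation noise sequence $\{w_k\}$ are white, Gaussian with

$$E\{v_k\} = 0, \qquad E\{v_k v_l'\} = \begin{cases} 0 & l \neq k \\ I & l = k \end{cases}$$

$$E\{w_k\} = 0, \qquad E\{w_k w_l'\} = \begin{cases} 0 & l \neq k \\ I & l = k \end{cases} \qquad \forall\, l,k$$

and are independent of each other, of the form sequence $\{r_k\}$
and the initial condition $x_0$. Here

$$E\{x_0\} = \bar x_0, \qquad E\{(x_0 - \bar x_0)(x_0 - \bar x_0)'\} = \Sigma_0$$

and $x_0$ is independent of the (deterministic) $r_0$.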


We seek to minimize the cost criterion

$$J_{k_0}(x_0, r_0) = E\Big\{ \sum_{k=k_0}^{N-1} \big[ u_k' R(r_k) u_k + x_{k+1}' Q(r_{k+1}) x_{k+1} + S(r_{k+1}) x_{k+1} + P(r_{k+1}) \big]
+ x_N' K_T(r_N) x_N + H_T(r_N) x_N + G_T(r_N) \Big\} \qquad (4.3)$$

Note that we have $x_k^1$ and $x_k^0$ (linear and constant) terms in (4.3). These are included
here for later comparison with the x-dependent form problems in
Part III of this thesis. In (4.1)-(4.3) we take, for each $i \in M$,

$$R(i) > 0, \qquad
\begin{pmatrix} Q(i) & S'(i)/2 \\ S(i)/2 & P(i) \end{pmatrix} \ge 0, \qquad
\begin{pmatrix} K_T(i) & H_T'(i)/2 \\ H_T(i)/2 & G_T(i) \end{pmatrix} \ge 0 \qquad (4.4)$$

The term

$$x_N' K_T(r_N) x_N + H_T(r_N) x_N + G_T(r_N)$$

in (4.3) is a terminal cost charged in addition to the time-invariant
cost

$$x_N' Q(r_N) x_N + S(r_N) x_N + P(r_N) .$$

The control problem is then to find the control law

$$u_k = \phi_k(y(k_0), \ldots, y(k);\ r(k_0), \ldots, r(k)) \qquad (4.5)$$

that minimizes (4.3). As in the linear quadratic Gaussian (LQG)
problem the optimal solution to this problem satisfies a separation
principle. In particular we have the following:

Proposition 4.1: The stochastic JLQ problem with incomplete and noisy measurements
as in (4.1)-(4.5) has the following solution. The optimal control
law is given by

$$u_{k-1}(\hat x_{k-1}, r_{k-1}=j) = -L_{k-1}(j)\,\hat x_{k-1} + F_{k-1}(j) . \qquad (4.6)$$

The control law parameters in (4.6) are

$$L_k(j) = \big[ R(j) + B'(j)\hat K_{k+1}(j)B(j) \big]^{-1} B'(j)\hat K_{k+1}(j)A(j) \qquad (4.7)$$

$$F_k(j) = -\tfrac{1}{2} \big[ R(j) + B'(j)\hat K_{k+1}(j)B(j) \big]^{-1} B'(j)\hat H_{k+1}'(j) \qquad (4.8)$$

for k=N-1, N-2,...,$k_0$ and $j \in M$ where $(K_k(j), H_k(j))$ are computed
recursively, backwards in time from k=N, by the following sets of
M coupled matrix difference equations:

$$K_k(j) = A'(j)\hat K_{k+1}(j) \Big[ I - B(j)\big[ R(j) + B'(j)\hat K_{k+1}(j)B(j) \big]^{-1} B'(j)\hat K_{k+1}(j) \Big] A(j) \qquad (4.9)$$

$$H_k(j) = \hat H_{k+1}(j) \Big[ I - B(j)\big[ R(j) + B'(j)\hat K_{k+1}(j)B(j) \big]^{-1} B'(j)\hat K_{k+1}(j) \Big] A(j) \qquad (4.10)$$

where

$$\hat K_{k+1}(j) = \sum_{i=1}^{M} p(j,i)\,[K_{k+1}(i) + Q(i)] \qquad (4.11)$$

$$\hat H_{k+1}(j) = \sum_{i=1}^{M} p(j,i)\,[H_{k+1}(i) + S(i)] \qquad (4.12)$$

with terminal conditions

$$K_N(j) = K_T(j) \qquad (4.13)$$

$$H_N(j) = H_T(j) \qquad (4.14)$$

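The optimal (minimum mean square error) estimate $\hat x_k$ in (4.6) is
given by the following form-dependent Kalman filter:

x-estimate extrapolation:
$$\hat x_k(-) = A(r_{k-1})\hat x_{k-1} + B(r_{k-1})u_{k-1} \qquad (4.15)$$

error covariance extrapolation:
$$\Sigma_k(-) = A(r_{k-1})\Sigma_{k-1}(r_{k-1})A'(r_{k-1}) + \Xi(r_{k-1})\Xi'(r_{k-1}) \qquad (4.16)$$

x-estimate update:
$$\hat x_k = \hat x_k(-) + W_k(r_k)\big[ y_k - C(r_k)\hat x_k(-) - D(r_k)u_k \big] \qquad (4.17)$$

error covariance update:
$$\Sigma_k(r_k) = \big[ I - W_k(r_k)C(r_k) \big] \Sigma_k(-) \qquad (4.18)$$

filter gain matrix:
$$W_k(r_k) = \Sigma_k(-)C'(r_k) \big[ C(r_k)\Sigma_k(-)C'(r_k) + \Lambda(r_k)\Lambda'(r_k) \big]^{-1}$$

with initial conditions

$$\hat x_{k_0}(-) = \bar x_0, \qquad \Sigma_{k_0}(-) = \Sigma_0 .$$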


The  optimal  expected  cost- to- go  is 


vk0‘Vro’  ■  *'o\{ro)xo  *  \(lo)xo 


+  G,  (rrt)  +  trtK^  (r^)^] 


V  0 


0'  0 


N— 1 

*  l  tr 

k=k„ 


\(rk,Lk(rk}  Vl^k^^^lric)  r 


where 


4  \+l(j)B(j) 


M 

, 

+  I  p(j»i)tr 

H  (i) 

+ 

H(i) 

k-1 

.Q(i) 

r  R(j> 

1 

-1 

B*  (j)Kk+l(j)B(j) 


B,«)HjUl(j) 


with

$$\hat G_{k+1}(j) = \sum_{i=1}^{M} p(j,i)\,[G_{k+1}(i) + P(i)] \qquad (4.24)$$

and terminal condition

$$G_N(j) = G_T(j), \qquad j \in M . \qquad (4.25)$$

At each time k=N,N-1,...,$k_0$

$$\begin{pmatrix} K_k(j) & H_k'(j)/2 \\ H_k(j)/2 & G_k(j) \end{pmatrix} \ge 0 \qquad (4.26)$$

This  is  proved  in  Appendix  B.4. 

Note that the control law is unchanged if there is no observation
or driving noise (i.e., $\Xi(j)=0$, $\Lambda(j)=0$, $\forall j \in M$). That
is, the certainty equivalence principle applies here.
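To illustrate why the estimator parameters depend on the observed form sequence, the following is a minimal scalar sketch of one extrapolate-then-update cycle of a form-dependent Kalman filter. It is an illustration under stated assumptions, not the thesis's implementation: the system is scalar, the direct feedthrough term D is taken to be zero, and the names `kalman_step` and `params` are hypothetical.

```python
def kalman_step(x_hat, sigma, u, y, r_prev, r_now, params):
    """One cycle of a scalar form-dependent Kalman filter (sketch, D = 0 assumed).

    params[r] is a dict with keys:
      a, b  : dynamics x_{k+1} = a*x_k + b*u_k + xi*v_k in form r
      xi    : input-noise gain in form r
      c, lam: observation y_k = c*x_k + lam*w_k in form r
    r_prev is the observed form at time k-1, r_now the observed form at time k.
    Returns (updated estimate, updated error variance, filter gain).
    """
    p, q = params[r_prev], params[r_now]
    # Extrapolation uses the parameters of the form that was in force at k-1.
    x_pred = p["a"] * x_hat + p["b"] * u
    s_pred = p["a"] ** 2 * sigma + p["xi"] ** 2
    # The update uses the parameters of the form observed at time k.
    gain = s_pred * q["c"] / (q["c"] ** 2 * s_pred + q["lam"] ** 2)
    x_new = x_pred + gain * (y - q["c"] * x_pred)
    s_new = (1.0 - gain * q["c"]) * s_pred
    return x_new, s_new, gain
```

Because `s_pred` and `gain` are functions of the realized pair (`r_prev`, `r_now`), the sequence of gains is determined by the sample path of the form process, which is why it cannot be tabulated in advance in the way the control gains can.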


4.2  Jump  Costs  and  Resets 

In this section we extend the range of problems that can be captured
by the JLQ problem formulation. Specifically we consider
problems where

•  jump costs are incurred when the form changes
from $r_{k-1}$ to $r_k$ at time k

•  the value of the x process may be reset
to an affine function of its current value
when the form changes.

Jump  costs  might  represent  start-up  or  shut-down  costs  of  equipment 
when  the  system  form  changes.  They  might  also  model  undesirable 
transient  phenomena  such  as  load  shedding  costs  in  electrical  power 
systems,  or  the  cost  of  equipment  destroyed  by  the  form  change. 

The resetting of x allows us to model failures that result from
abrupt changes to the dynamic state of the system. For example,
phenomena such as failure-caused biases in communication equipment or
rapid voltage jumps due to changing interconnections in electronic
devices can be modelled by resets of x. In addition we can use resets
to represent nonlinear systems as a collection of linear systems, each
associated with a different operating point. The x process might
represent the deviation of the state from the current nominal value. If
we assume that changes in the operating point are caused by external
events (and are not x-dependent) then the results of this section can
be applied. The x-dependent case is treated in Part III.

Consider  the  following  class  of  jump  linear  systems  with  affine 
resets : 

Vi  '  A(rk,xk  +  Blrk>\  *  S(rk,vk  <4-27) 

pr{wjlv1} 


i,  j  €  M 


(4.28) 


/K TU'j 

\HT(i,j 


a;(i»j)/2\ 

Vi,j)  / 


for  all  i,j  6  M. 

To  find  the  optimal  control  law  we  apply  dynamic  programming. 
We  have 


V  [x  ,r  ,r 1  =  x'K  (r  .  »r„ )x„  +  H  (r  , r  ) x 
NNN—  IN  NT  n- inn  tn—  inn 
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N-l  N-l 


XN-1 

+ 

S(j'rN)XN  +  P(j'V 

vrj 

+ 

'VWi’S-V 

J 

and  for  k=N-2,...,k, 


Vk(Xk'rk=j)=min 


YR(j,uk 


fW'i'Vi’Vi 

J  +  k 

H  s(j'rk+i}Vi +  p(j'W  Vj 


Vk+1 ^  Xk+1 ' rk+l } 


Using  (4.29)  (4.31)  we  can  rewrite  (4.37)  as  V]t(x^,r^=j)  = 

/  / 


p(j  ,i)E 


A(j)3Ck 
A(j,i)  + 

B(j)iak  Q(j/i) 

Jdlvj 

"  +  Z(j,i)  “ 


A(j)x. 

A(j,i)  +  * 

_5(i)\ 
+  Z(j,i) 


S  ( j  ,  i)  [A(  j  ,i)  {A(j)xk+B(j)uk+E(j)vk.}+Z(j,i)  ] 


P(j,i) 


(4.38) 


v^+l [A( j ,i) {A(j)xk+B(j)uk+S(j)vk}+Z(j,i)  ,i] 


Solving  (4.38)  recursively  for  the  optimal  control  sequence 
then  yields  the  following: 


Proposition 4.2: For the problem (4.27)-(4.34) the optimal control
law is given by

$$u_{k-1}(x_{k-1}, r_{k-1}=j) = -L_{k-1}(j)\,x_{k-1} + F_{k-1}(j) \qquad (4.39)$$

for k=N, N-1,...,$k_0+1$ and the optimal expected cost-to-go is

$$V_k(x_k, r_k=j) = x_k' K_k(j) x_k + H_k(j) x_k + G_k(j) \qquad (4.40)$$

for k=N-1, N-2,...,$k_0$ for each $j \in M$, where the parameters in (4.39)-(4.40)
are computed recursively, backwards in time, by
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R(j) 

+ 


rl 


1_B' 


B'(j)H-+i(j) 


M  t 

+  l  P(j,i)trj5' (j)A' ( j ,  i) 
i=l  ' 


Q ( j »i) 

+ 

Kk+l(l) 


A(j,i)  5  (j)> 


Vj)  = 


R(j) 

+ 

B' (j)Kk+1(j)B(j) 


-1 


B'  (j)\+1(j)A(j) 


V3)  - 


S(j) 

+ 


-1 


B'<j),W3> 


B'  lj)Vi<j)B<j)J 

as  in  Proposition  4.1,  but  with 


A  M  _  \+lU) 

=  l  p ( j  , i)  A'  ( j  ,  i)  +  A ( j , i) 


(4.41) 


(4.42) 


(4.43) 


(4.44) 


(4.45) 


(4.46) 


Hk+l(j) 


1  P(j»i) 
i=l 


<Wj) 


M 

Y  P ( j  »i) 
i=l 


Z'  (j,i) 


k+1 


k+1 


and  terminal  conditons 


M 


KT(j ,i) 


A(j,i) 


N 


i-1 

LQ(3 

•L)  J 

M 

[j 

r" 

=  l  P(j,i) 

)2Z'  (j,i) 

+ 

+ 

i-1 

( 

L 

r 

r 

+ 

S  (j  ,i) 


M 


GN(j)=  l  P<j.i) 


Z'  (j,i) 


KT(j,i) 


+ 

Q< j  ri) 


Z ( j ,i) 


[H  (j,i)  +  S(j,i)  ]z(j,i) 


[G  ( j , i)  +  P( j,i) ] 


At  each  time  k=N-l, . . .  ,k  , 


V3) 


H^(j)/2 


\(j)/2  Gk(j) 


>  0  . 


Comparing this result with Proposition 4.1 in the noiseless case
(i.e., $\Xi(j)=0$, $\Lambda(j)=0$, $\forall j \in M$) we see that the cost and control laws
are the same^1 but the definitions of $\hat K_{k+1}(j)$, $\hat H_{k+1}(j)$ and $\hat G_{k+1}(j)$
are different. Note that

•  the $\Lambda(j,i)$ (linear reset) parameters enter
into all of the cost and control law terms
(as do the Q(j,i)'s)

•  the Z(j,i) (constant reset) parameters do not
affect the linear gain of the optimal control

The  following  example  illustrates  some  of  the  qualitative  effects 
of  jump  costs  on  the  controlled  system's  behavior. 


Example  4.1: 


Consider  the  following  problem: 


$$x_{k+1} = x_k + u_k \qquad \text{if } r_k = 1$$

$$x_{k+1} = 2x_k + u_k \qquad \text{if } r_k = 2$$


For  deterministic 


$$\min E\Big\{ \sum_{k=k_0}^{N-1} u_k^2 R(r_k) + x_{k+1}^2 Q(r_k, r_{k+1}) \Big\}$$

where

R(1) = 1       cheap to control when $r_k = 1$

R(2) = 1000    expensive to control when $r_k = 2$

If we take

Q(1,1) = Q(1,2) = 1    Q(2,1) = Q(2,2) = 1

(i.e., no jump costs) then we have the same problem as example 3.2.
The optimal JLQ controller parameters and the closed-loop dynamics
for this case are listed for four time stages in Tables 4.1 and
4.2, respectively.

k      K_k(1)      K_k(2)       L_k(1)      L_k(2)
N-1    .5          3.996004     .5          1.998x10^-3
N-2    .6490736    7.384818     .6490736    3.67203x10^-3
N-3    .6990352    9.2692147    .6990352    4.60253x10^-3
N-4    .7187893    10.198343    .7187893    5.06036x10^-3

Table 4.1: Optimal Gains and Costs of Example 4.1


Now suppose that there is a jump cost charged when the form changes
from r=1 to r=2. Take

Q(1,2) = 2

Q(1,1) = Q(2,1) = Q(2,2) = 1.

The optimal controller parameters and closed-loop dynamics
for this case are listed in Tables 4.3 and 4.4, respectively.
Note that the additional expected cost-to-go caused by this
penalty is slight: about 1.25% greater from $r_{N-4}=1$ and 0.70%
greater from $r_{N-4}=2$. Comparing Tables 4.2 and 4.4 we see that
in form 1, the closed-loop optimal system drives x to zero a little
more quickly when this jump cost is present.


k      K_k(1)      K_k(2)       L_k(1)      L_k(2)
N-1    .5238095    3.996004     .5238095    1.998x10^-3
N-2    .6634162    7.4701392    .6634162    3.73506x10^-3
N-3    .7096495    9.3545273    .7096495    4.67726x10^-3
N-4    .7278253    10.270011    .7278253    5.135x10^-3

Table 4.3: Example 4.1 with Q(1,2)=2.


k      a(1)-b(1)L_k(1)    a(2)-b(2)L_k(2)
N-1    .4761905           1.998002
N-2    .3365838           1.9962649
N-3    .2903505           1.9953227
N-4    .2721747           1.9949865

Table 4.4: Closed-Loop Optimal Dynamics when Q(1,2)=2.


Now suppose that the jump cost is high. Take

Q(1,2) = 1000

Q(1,1) = Q(2,1) = Q(2,2) = 1.

Then the optimal strategy in form 1 is to drive x almost completely
to zero in one time step (incurring a cost of about $u^2 R(1) = 1$).
The optimal strategy in form 2 remains the same; almost no control
is used. The optimal cost and control law parameters for this
high jump cost case are listed in Table 4.5, and the closed-loop
dynamics are in Table 4.6.

k      K_k(1)      K_k(2)       L_k(1)      L_k(2)
N-1    .9901864    3.996004     .9901864    1.998x10^-3
N-2    .9903092    9.1421301    .9903092    4.57196x10^-3
N-3    .9903573    11.190689    .9903573    5.594534x10^-3
N-4    .9903763    12.005421    .9903763    6.00271x10^-3

Table 4.5: Example 4.1 with Q(1,2)=1000.


k      a(1)-b(1)L_k(1)    a(2)-b(2)L_k(2)
N-1    9.8136x10^-3       1.998002
N-2    9.6908x10^-3       1.995428
N-3    9.6427x10^-3       1.9944055
N-4    9.6237x10^-3       1.9939973

Table 4.6: Closed-Loop Optimal Dynamics when Q(1,2)=1000.
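The first backward stage (k = N-1) of Example 4.1 can be checked with a short scalar sketch of the recursion with jump costs and no resets. Several things here are assumptions rather than statements from the text: the terminal costs are taken as $K_N(1)=K_N(2)=0$, and the transition probabilities, which are not listed in this section, are set to p(1,1)=0.9, p(1,2)=0.1 because those values are consistent with the k = N-1 entries of Tables 4.1, 4.3 and 4.5; the second row of `P` is a placeholder that does not affect this stage. The function names are hypothetical.

```python
def jlq_stage(a, b, R, Qjump, P, K_next):
    """One backward step of a scalar JLQ recursion with jump costs (no resets).

    a, b, R     : dicts mapping form -> scalar
    Qjump[j][i] : weight on x_{k+1}^2 charged for a j -> i transition
    P[j][i]     : transition probability from form j to form i (ASSUMED values)
    K_next      : dict mapping form -> K_{k+1}(form)
    Returns (K_k, L_k, closed_loop) as dicts keyed by form.
    """
    K, L, closed_loop = {}, {}, {}
    for j in a:
        # Expected quadratic weight on x_{k+1}^2, averaged over the next form.
        k_hat = sum(P[j][i] * (Qjump[j][i] + K_next[i]) for i in P[j])
        denom = R[j] + b[j] ** 2 * k_hat
        K[j] = a[j] ** 2 * R[j] * k_hat / denom
        L[j] = a[j] * b[j] * k_hat / denom
        closed_loop[j] = a[j] - b[j] * L[j]
    return K, L, closed_loop


def first_stage(q12):
    """Stage k = N-1 of Example 4.1 for a given jump cost Q(1,2) = q12."""
    a = {1: 1.0, 2: 2.0}
    b = {1: 1.0, 2: 1.0}
    R = {1: 1.0, 2: 1000.0}
    Qjump = {1: {1: 1.0, 2: q12}, 2: {1: 1.0, 2: 1.0}}
    P = {1: {1: 0.9, 2: 0.1}, 2: {1: 0.9, 2: 0.1}}   # assumed, see lead-in
    K_terminal = {1: 0.0, 2: 0.0}                      # assumed terminal costs
    return jlq_stage(a, b, R, Qjump, P, K_terminal)
```

Calling `first_stage(1.0)`, `first_stage(2.0)` and `first_stage(1000.0)` gives the three jump-cost cases; the form-1 gain rises toward 1 as Q(1,2) grows, which is the "drive x to zero quickly" behavior described in the example.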


4.3 Summary

This chapter completes our study of JLQ problems with
x-independent forms. As we have shown in this chapter and in chapter 3
the linear quadratic optimal control problem formulation can be
extended to jump linear systems in a straightforward way, provided
that the jumps are x-independent and perfectly observed.

In parts III and IV of the thesis we will consider JLQ problems
that involve form changes that are x-dependent, either explicitly or
through controls. As we shall see, the structure and behavior of the
optimal JLQ controller becomes much more complex in these cases and
displays features not captured by the problems studied to
this point.


5.  SCALAR  JLQ  PROBLEMS  WITH  X-DEPENDENT  FORMS 


5.1  Introduction 

In this chapter we examine a class of nonlinear stochastic control
problems that capture the active hedging issue of fault-tolerant
optimal control. The problems under consideration are scalar-in-x JLQ
problems with form transition probabilities that depend on x.
Specifically, we consider

form transition probabilities that are (or
can be approximated as being) piecewise-
constant in x.

For this class of problems we develop a recursive procedure for the
determination of the optimal expected costs-to-go and control laws
"off-line," in advance of system operation. We also establish a number
of qualitative properties of the optimal controller.

The optimal expected costs-to-go are piecewise-quadratic and the
control laws are piecewise-linear in $x_k$, in each form. That is, the
real line is partitioned into a number of intervals of x values
("pieces"), and over each such interval $V_k(x_k, r_k=j)$ is quadratic^1 in
$x_k$ and $u_k(x_k, r_k=j)$ is linear, for each form $j \in M$.

For each $j \in M$ at time k the expected cost-to-go $V_k(x_k, r_k=j)$ and
control law $u_k(x_k, r_k=j)$ have the same number of pieces, $m_k(j)$. In
general this number grows as (N-k) increases. A typical expected
cost-to-go and control law are shown in figure 5.1.


1 In this chapter the term quadratic in $x_k$ is used for functions of the
form $a_0 + a_1 x_k + a_2 x_k^2$; the term linear is used for functions of the form


The different pieces of $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ arise from
using the control to actively hedge. Intuitively, at each stage the
optimal controller must take into account what the expected cost of
driving x into different regions will be, where different values of the
form transition probabilities apply. As the control problem is solved
backwards in time from a finite terminal time (using dynamic programming),
the controller must take into account what the effects of active hedging
will be at the intervening times.

The procedure that is developed here for computing the optimal
$V_k(x_k, r_k=j)$ (inductively, backwards in time for
finite time-horizon problems) involves the computation and comparison of
a growing number of quadratic functions at each stage and for each
$j \in M$. These quadratic functions are computed via Riccati-like
difference equations. All of these computations can be done off-line, as
in the x-independent JLQ problem.

The basic idea of this solution procedure is simple; essentially
the nonlinearity of the system dynamics (due to the x-dependence of
the form transition probabilities) is converted into computational
complexity in the determination of $V_k(x_k, r_k=j)$. It is the piecewise-
constant structure of the form transition probabilities that allows us
to do this.

At each time stage k, the control problem involving the determination
of $V_k(x_k, r_k=j)$ for a system having the full hybrid structure (as pictured
on the left of figure 5.2) is transformed into the comparison of many
constrained-in-$x_{k+1}$ JLQ control problem costs with x-independent
form transitions (as pictured on the right in figure 5.2). One
constrained problem arises for each region of $x_{k+1}$ values having
different

•  form transition probabilities out of j
($p_{ji}$, $i \in C_j$)

•  different pieces in the expected costs-to-go
at the succeeding time (i.e., $V_{k+1}(x_{k+1}, r_{k+1}=i)$),
for each form in the cover of j (i.e., $i \in C_j$).

FIGURE 5.2: Conversion of nonlinearity into computational complexity

The number of costs-to-go that must be compared at each stage, and
the number of pieces, $m_k(j)$, in $V_k(x_k, r_k=j)$ grows

•  at most linearly with the number of transition
probability pieces

•  at most geometrically with the number of forms that
are accessible from form j in one time step.

The "piecewise" structure of the optimal expected costs-to-go and
control laws is caused by the piecewise-constant structure of the form
transition probabilities.

The solution procedure developed in this chapter provides an
"approximately optimal" controller for problems where the true
x-dependent form transition probabilities have been approximated in a
piecewise-constant way. Clearly this approximation can be made
arbitrarily close to the true controller by using a fine enough
piecewise-constant approximation. Thus there is a tradeoff between

accuracy of the form-transition probability
approximations (and the resulting optimal
controller)

vs.

computational complexity in the off-line
determination of the optimal controller and
in the number of controller pieces $m_k(j)$
that must be implemented on-line.


Although  the  basic  idea  of  this  chapter  is  simple,  the  derivation 
and  presentation  of  the  general  result  involves  unavoidably  complicated 
notation  and  "bookkeeping"  problems.  For  this  reason,  this  chapter 
has  been  organized  as  follows: 

1.  In section 5.2 the general problem is formulated.

2.  In section 5.3 one stage of a simple problem is
solved from first principles.

3.  Guided by intuition gained from this example, a
general solution procedure is developed in section
5.4 and certain qualitative properties of the
optimal controller are established.

4.  In section 5.5 this solution procedure is used to
solve the next stage of the example problem.

5.  In section 5.6 a number of combinatoric properties
(i.e., concerning the number of pieces of the optimal
solutions, etc.) and qualitative
properties of the optimal controller are
established. These results are motivated
by the example problem.

From the study of the optimal controllers developed here we can
gain insight into the structures of controllers that use active hedging,
and into the qualitative effects of their control actions.

In chapters 6 and 7 the results of this chapter are used to
investigate a number of additional qualitative properties of the controller;
in particular, steady-state behaviors are examined. In addition, an
algorithm flowchart that efficiently performs the calculations specified
in section 5.3 is presented. In Part IV this algorithm is extended to
include more general jump linear control problems.

5.2 General Problem Formulation

5.2  General  Problem  Formulation 

In this section we present the general problem formulation that
is addressed in this chapter. We restrict our attention to the
time-invariant case so as to simplify notation somewhat. All of the results
of this chapter can be directly extended to the time-varying case.

Consider the discrete-time jump linear system

$$x_{k+1} = a(r_k)x_k + b(r_k)u_k \qquad (5.1)$$

$$\Pr\{r_{k+1}=j \mid r_k=i,\ x_{k+1}=x\} = p(i,j;x) \qquad (5.2)$$

$$x(k_0) = x_0, \qquad r(k_0) = r_0$$

Each transition probability p(i,j;x) of the form process is assumed
to be piecewise-constant in x, having a finite number of pieces $\bar\nu_{ij}$.
That is, the real line is partitioned into $\bar\nu_{ij}$ disjoint intervals
with the transition probabilities taking constant values over each
interval:

$$p(i,j;x) = \lambda_{ij}(s) \quad \text{if} \quad \nu_{ij}(s-1) < x < \nu_{ij}(s) \qquad (5.3)$$

where $s = 1, 2, \ldots, \bar\nu_{ij}$ and

$$-\infty \triangleq \nu_{ij}(0) < \nu_{ij}(1) < \cdots < \nu_{ij}(\bar\nu_{ij}-1) < \nu_{ij}(\bar\nu_{ij}) \triangleq \infty \qquad (5.4)$$

These grid points $\nu_{ij}(s)$ may be different for each pair
$(i,j) \in M \times M$. For all $s = 1, 2, \ldots, \bar\nu_{ij}$

$$\lambda_{ij}(s) \ge 0 \quad \text{for each } i,j \in M$$

$$\sum_{j=1}^{M} \lambda_{ij}(s) = 1 \quad \text{for each } i \in M$$

A typical p(i,j;x) is illustrated in figure 5.3.
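A piecewise-constant transition probability of the kind in (5.3)-(5.4) can be represented by its finite grid points and piece values. The following is a minimal sketch, not a construction from the text: the helper name `make_piecewise` is hypothetical, and assigning a grid point itself to the interval on its left is a convention chosen here, since (5.3) as printed does not settle which piece owns the endpoints.

```python
from bisect import bisect_left


def make_piecewise(grid_points, values):
    """Return p(x) that is constant between consecutive finite grid points.

    grid_points : strictly increasing finite breakpoints nu(1) < ... < nu(n-1)
    values      : n piece values lambda(1), ..., lambda(n), left to right
    Convention (an assumption of this sketch): x equal to a grid point is
    assigned to the piece on its left.
    """
    if len(values) != len(grid_points) + 1:
        raise ValueError("need exactly one more value than grid points")
    if any(lo >= hi for lo, hi in zip(grid_points, grid_points[1:])):
        raise ValueError("grid points must be strictly increasing")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("piece values must be probabilities")

    def p(x):
        # bisect_left counts the grid points strictly below x,
        # which is the zero-based index of the piece containing x.
        return values[bisect_left(grid_points, x)]

    return p
```

For instance, `make_piecewise([-1.0, 1.0], [0.75, 0.25, 0.75])` gives a probability that is low for small |x| and higher otherwise; the complementary probability for a two-form system is then `1 - p(x)`, so each row sums to one at every x.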


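The form process is not Markovian because of its dependence on x.
The cost criterion is

$$J_{k_0}(x_0, r_0) = E\Big\{ \sum_{k=k_0}^{N-1} \big[ u_k^2 R(r_k) + x_{k+1}^2 Q(r_{k+1}) + S(r_{k+1}) x_{k+1} + P(r_{k+1}) \big]
+ x_N^2 K_T(r_N) + H_T(r_N) x_N + G_T(r_N) \Big\} \qquad (5.5)$$

where the expectation is over $\{r_{k_0}, \ldots, r_N\}$. Since $\{(x_k, r_k): k=k_0,\ldots,N\}$
is a Markov process we need only consider feedback laws of the form

$$u_k = u_k(x_k, r_k) .$$

Here

$$R(i) > 0, \qquad
\begin{pmatrix} Q(i) & S'(i)/2 \\ S(i)/2 & P(i) \end{pmatrix} \ge 0, \qquad
\begin{pmatrix} K_T(i) & H_T'(i)/2 \\ H_T(i)/2 & G_T(i) \end{pmatrix} \ge 0 \qquad (5.6)$$

for each $i \in M$. We will assume here that $b(j) \neq 0$ for each $j \in M$.^1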


1 The result for forms where b(j)=0 (that is, the system just "coasts"
in form j) is presented at the end of Appendix C.2. This result is
used in some of the examples in chapters 6 and 7.

The term

$$x_N^2 K_T(r_N) + H_T(r_N) x_N + G_T(r_N)$$

in (5.5) is a terminal cost charged in addition to the time-invariant
cost

$$x_N^2 Q(r_N) + x_N S(r_N) + P(r_N) .$$

The $x_k^1$ and $x_k^0$ terms ($x_k S(r_k)$, $P(r_k)$, $x_N H_T(r_N)$ and $G_T(r_N)$) are
included in (5.5) because they naturally arise in the computation of
the expected costs-to-go. Even if the x-costs in (5.5) are simple
quadratics (i.e., $S(i) = P(i) = H_T(i) = G_T(i) = 0$), some of the
quadratic pieces of the optimal expected costs-to-go will have $x_k^1$ and
$x_k^0$ terms.


5.3  One  Stage  of  an  Example  Problem 

In this section we solve one time stage of a simple example
problem satisfying (5.1)-(5.6). This is done to illustrate the basic
solution idea alluded to in section 5.1, and to gain insight into the
qualitative properties of this class of problems. A number of
observations and claims inspired by this example are listed at the end of
this section for later consideration.


Example 5.1:

Consider the following system having M=2 forms:

$$x_{k+1} = x_k + u_k \qquad \text{if } r_k = 1$$

$$x_{k+1} = 2x_k + u_k \qquad \text{if } r_k = 2$$

$$p(1,2:x) = \begin{cases} 1/4 & |x| < 1 \\ 3/4 & |x| > 1 \end{cases}$$

$$p(1,1:x) = 1 - p(1,2:x), \qquad p(2,2) = 1, \qquad p(2,1) = 0$$

We seek to minimize

$$\min_{u_0, \ldots, u_{N-1}} E\Big\{ \sum_{k=0}^{N-1} \big( u_k^2 + x_{k+1}^2 \big) + x_N^2 K_T(r_N) \Big\}$$

where $K_T(1)=0$, $K_T(2)=3$. The form structure and form transition
probability p(1,2:x) for this example are shown in figure 5.4.

The values r=1 and r=2 might denote, respectively, "normal" and
"failure mode" operation. The $K_T(2)$ parameter represents a penalty
charged for failure of the system. The probability of failure
p(1,2:x) is low for small magnitude x, and larger if |x| > 1.

Once the system fails (attains form $r=2$), it stays there. In
this form the usual LQ solution applies. The optimal (deterministic)
cost-to-go is

$$V_k(x_k, r_k=2) = x_k^2 K_k(2)$$

for $k = N, N-1, \dots, 0$ where

$$K_k(2) = \frac{a^2(2) R(2) [K_{k+1}(2) + Q(2)]}{R(2) + b^2(2) [K_{k+1}(2) + Q(2)]} = \frac{4 (K_{k+1}(2) + 1)}{2 + K_{k+1}(2)}$$

and the optimal control law in form $r=2$ is given by

$$u_k(x_k, r_k=2) = -L_k(2) x_k$$

where

$$L_k(2) = \frac{a(2) b(2) [K_{k+1}(2) + Q(2)]}{R(2) + b^2(2) [K_{k+1}(2) + Q(2)]} = \frac{2 (K_{k+1}(2) + 1)}{2 + K_{k+1}(2)}$$

Here we have quick convergence as $(N-k)$ increases, to

$$K_k(2) \to 1 + \sqrt{5} = 3.236068$$

$$L_k(2) \to \frac{1 + \sqrt{5}}{2} = 1.618034$$

as seen in Table 5.1, below.
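The convergence claimed for form 2 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the parameter values implied by the recursion above ($a(2)=2$, $b(2)=Q(2)=R(2)=1$, $K_T(2)=3$), and the function name `riccati_form2` is purely illustrative.

```python
import math

def riccati_form2(K_T=3.0, stages=10, a=2.0, b=1.0, Q=1.0, R=1.0):
    """Iterate the scalar form-2 Riccati recursion backwards from k = N.

    Returns a list of (K_k(2), L_k(2)) pairs for k = N-1, N-2, ...
    All default parameter values are assumptions read off the example.
    """
    K = K_T
    history = []
    for _ in range(stages):
        S = K + Q                            # K_{k+1}(2) + Q(2)
        L = a * b * S / (R + b * b * S)      # gain L_k(2)
        K = a * a * R * S / (R + b * b * S)  # cost coefficient K_k(2)
        history.append((K, L))
    return history

hist = riccati_form2()
K_inf, L_inf = hist[-1]
print(K_inf, L_inf)
```

After ten backward steps both quantities agree with $1+\sqrt{5}$ and $(1+\sqrt{5})/2$ to well within the digits quoted in the text.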


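Now we examine what happens when $r_{N-1} = 1$.

We are given that

$$V_N(x_N, r_N=1) = x_N^2 K_T(1) = 0.$$

Now consider the situation one stage back in time. With probability
$p(1,2:x_N)$ the system will switch to form 2 at time $N$, and we will be
charged

$$x_N^2 + V_N(x_N, r_N=2) = 4 x_N^2.$$

With probability $[1 - p(1,2:x_N)]$ the system will stay in form 1 and
we will be charged

$$x_N^2 + V_N(x_N, r_N=1) = x_N^2.$$

In addition we will be charged a control cost

$$u_{N-1}^2 R(1) = u_{N-1}^2$$

for whatever control we choose.

That is

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{u_{N-1}} \left\{ u_{N-1}^2 + p(1,1:x_N) \left[ x_N^2 + V_N(x_N, r_N=1) \right] + p(1,2:x_N) \left[ x_N^2 + V_N(x_N, r_N=2) \right] \right\}$$

$$= \min_{u_{N-1}} \left\{ u_{N-1}^2 + p(1,1:x_N)\, x_N^2 + p(1,2:x_N)\, 4 x_N^2 \right\}$$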


Note that we can control the failure probability $p(1,2:x_N)$, and thus
the cost incurred at time $N$, by our choice of $x_N$ (through the choice
of $u_{N-1}$). It is this point that makes $V_{N-1}(x_{N-1}, r_{N-1}=1)$ a
non-quadratic function of $x_{N-1}$. However, as we have indicated, it is
piecewise quadratic and this is a direct consequence of the piecewise
constant nature of $p(i,j:x_N)$. The basic reason for this is actually
quite simple and by going through it we can obtain an initial
understanding of the nature of the problem.

Suppose that $x_{N-1}$ has a given value. Then, by applying our
optimal control, one of three things will happen: either $x_N \le -1^-$, or
$-1^+ \le x_N \le 1^-$, or $x_N \ge 1^+$. In each of these cases the cost
incurred at time $N$ is a quadratic function of $x_N$. Consequently this
suggests the following strategy for computing $V_{N-1}(x_{N-1}, r_{N-1}=1)$
and the associated optimal control law:

    For each of the 3 possible regions, solve the
    constrained optimization problem assuming that
    $x_N$ is in the specified region. As indicated
    above, each such constrained problem is quadratic.
    Once we have the solutions to these problems, we
    compare them and obtain the optimal solution by
    choosing the smallest of these for each value of
    $x_{N-1}$. As we will see the result is a piecewise
    quadratic cost-to-go and a piecewise linear
    optimal control law.
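The strategy just described can be sketched in a few lines for this particular example. The sketch below is illustrative only and makes several assumptions: form 1 dynamics $x_N = x_{N-1} + u_{N-1}$ with unit control weight, time-$N$ costs of $(13/4)x_N^2$ where $p(1,2:x_N)=3/4$ and $(7/4)x_N^2$ where $p(1,2:x_N)=1/4$, and closed intervals standing in for the one-sided limits $\pm 1^{\pm}$. The names `constrained_cost`, `REGIONS` and `one_stage` are hypothetical.

```python
def constrained_cost(x, c, lo, hi):
    """Minimize u**2 + c*x_next**2 with x_next = x + u and lo <= x_next <= hi.

    The cost is convex in x_next, so clipping the unconstrained minimizer
    x / (1 + c) to [lo, hi] gives the constrained minimizer.
    Returns (cost, u, x_next).
    """
    x_next = min(max(x / (1.0 + c), lo), hi)
    u = x_next - x
    return u * u + c * x_next * x_next, u, x_next

INF = float("inf")
# (left end, right end, coefficient of x_N**2) for the three x_N regions
REGIONS = [(-INF, -1.0, 13 / 4), (-1.0, 1.0, 7 / 4), (1.0, INF, 13 / 4)]

def one_stage(x):
    """Best (cost, u, x_N) over the three constrained problems at x_{N-1} = x."""
    return min(constrained_cost(x, c, lo, hi) for lo, hi, c in REGIONS)

for x in (0.0, 1.0, 3.0, 10.0):
    print(x, one_stage(x))
```

Taking the minimum over tuples compares the cost first, so `one_stage` returns the cheapest of the three constrained solutions together with the control and the resulting $x_N$.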

As we have indicated, in this example there are three $x_N$ regions:

(1) $x_N \le -1^-$, where $p(1,2:x_N) = 3/4$

(2) $-1^+ \le x_N \le +1^-$, where $p(1,2:x_N) = 1/4$

(3) $1^+ \le x_N$, where $p(1,2:x_N) = 3/4$.

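The three corresponding constrained control problems are

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \min_{u_{N-1}\ \text{s.t.}\ x_N \le -1^-} \left\{ u_{N-1}^2 + \tfrac{13}{4} x_N^2 \right\} \tag{5.7}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \min_{u_{N-1}\ \text{s.t.}\ -1^+ \le x_N \le 1^-} \left\{ u_{N-1}^2 + \tfrac{7}{4} x_N^2 \right\} \tag{5.8}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{u_{N-1}\ \text{s.t.}\ 1^+ \le x_N} \left\{ u_{N-1}^2 + \tfrac{13}{4} x_N^2 \right\} \tag{5.9}$$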


Note that the costs in the first and third regions of $x_N$ values are
the same, because of the symmetry of $p(1,2:x)$ about zero.

Consider the second region:

$$-1 < x_N < 1$$

Differentiating $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2)$ in (5.8) with respect to
$u_{N-1}$ and setting the derivative to zero, we find that

$$u_{N-1}(x_{N-1}) = -.6363636\, x_{N-1} \tag{5.10}$$

with the resulting cost

$$V_{N-1}(x_{N-1}) = .6363636\, x_{N-1}^2 \tag{5.11}$$
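For reference, the coefficients in (5.10) and (5.11) can be recovered by hand. The short derivation below is a sketch that assumes the form 1 dynamics $x_N = x_{N-1} + u_{N-1}$ and the time-$N$ cost coefficient $7/4$ that applies when $p(1,2:x_N) = 1/4$:

```latex
\begin{align*}
J(u_{N-1}) &= u_{N-1}^2 + \tfrac{7}{4}\,(x_{N-1} + u_{N-1})^2 \\
\frac{dJ}{du_{N-1}} &= 2u_{N-1} + \tfrac{7}{2}\,(x_{N-1} + u_{N-1}) = 0
  \;\Longrightarrow\; u_{N-1} = -\tfrac{7}{11}\,x_{N-1} \approx -.6363636\,x_{N-1} \\
J\!\left(-\tfrac{7}{11}x_{N-1}\right)
  &= \left(\tfrac{49}{121} + \tfrac{7}{4}\cdot\tfrac{16}{121}\right) x_{N-1}^2
   = \tfrac{7}{11}\,x_{N-1}^2 \approx .6363636\,x_{N-1}^2
\end{align*}
```

The resulting state is $x_N = \tfrac{4}{11} x_{N-1}$, which lies in $(-1,1)$ exactly when $|x_{N-1}| < 2.75$.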


But this $u_{N-1}$ only solves (5.8) if the $x_N$ that results from it
obeys the constraint. That is, we must have [1]

$$-1 < x_N = x_{N-1} - .6364\, x_{N-1} < 1$$

which holds if and only if

$$-2.75 < x_{N-1} < 2.75$$

For $x_{N-1} > 2.75$ the best value of $x_N$ in the interval $(-1,1)$
is $x_N = 1^-$. This is achieved if

$$u_{N-1}(x_{N-1}) = \frac{x_N - a(1) x_{N-1}}{b(1)} = 1 - x_{N-1}$$

and the resulting cost is

$$V_{N-1}(x_{N-1}) = x_{N-1}^2 - 2 x_{N-1} + 2.75 \tag{5.12}$$

Similarly, for $x_{N-1} < -2.75$, the best value of $x_N$ in the
interval $(-1,1)$ is at $x_N = -1^+$. This is achieved with

$$u_{N-1}(x_{N-1}) = -1 - x_{N-1}$$

and the resulting cost is

$$V_{N-1}(x_{N-1}) = x_{N-1}^2 + 2 x_{N-1} + 2.75 \tag{5.13}$$

[1] Rounding numbers to four significant digits.


Thus the optimal cost-to-go of (5.8) (where $x_N$ is constrained to be
in $(-1,1)$) has the three-piece quadratic form of figure 5.5.

The unconstrained cost of (5.11), as a function of $x_{N-1}$, is
indicated by the dashed line. It applies for $x_{N-1} \in (-2.75, 2.75)$;
this is indicated by the solid overline.

The constrained cost (5.12), corresponding to making $x_N = 1^-$, is
depicted by the dot-dash line. It applies for $x_{N-1} > 2.75$, as
indicated by the solid line.

The constrained cost (5.13), which results from $x_N = -1^+$, is
represented (as a function of $x_{N-1}$) by the dotted line. It applies for
$x_{N-1} < -2.75$.

Note that each of the constrained costs ((5.12), (5.13)) is greater than
the unconstrained cost (5.11) except at a single point. At this point
their values and their slopes match. This fact will be of interest
later in this chapter.

The other two constrained control problems (5.7), (5.9) can be
solved as we have done above for (5.8). Their solutions have only
two quadratic parts because $x_N$ is not constrained in one direction.

The optimal expected costs-to-go for all three problems ((5.7)-(5.9)) are:


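$$V_{N-1}(x_{N-1}, 1 \mid 1) = \begin{cases} .7647058\, x_{N-1}^2 & \text{if } x_{N-1} \le -4.25 \\ x_{N-1}^2 + 2 x_{N-1} + 4.2499985 & \text{if } x_{N-1} \ge -4.25 \end{cases} \tag{5.14}$$

$$V_{N-1}(x_{N-1}, 1 \mid 2) = \begin{cases} x_{N-1}^2 + 2 x_{N-1} + 2.75 & \text{if } x_{N-1} \le -2.75 \\ .6364\, x_{N-1}^2 & \text{if } -2.75 \le x_{N-1} \le 2.75 \\ x_{N-1}^2 - 2 x_{N-1} + 2.75 & \text{if } x_{N-1} \ge 2.75 \end{cases} \tag{5.15}$$

$$V_{N-1}(x_{N-1}, 1 \mid 3) = \begin{cases} x_{N-1}^2 - 2 x_{N-1} + 4.25 & \text{if } x_{N-1} \le 4.25 \\ .7647\, x_{N-1}^2 & \text{if } x_{N-1} \ge 4.25 \end{cases} \tag{5.16}$$

These are shown in figures 5.6, 5.5, and 5.7, respectively.

Having solved the constrained problems (5.7)-(5.9), we are
ready to compare them: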


$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{t=1,2,3} V_{N-1}(x_{N-1}, r_{N-1}=1 \mid t)$$

This is done graphically in figure 5.8.

Choosing the lowest of the three constrained costs at each
$x_{N-1}$ value, we see that:

(1) $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1)$ is optimal from $x_{N-1} = -\infty$ until
    it crosses $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2)$, at
    $x_{N-1} = -6.7684932$.

(2) $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2)$ is then optimal until it
    crosses $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)$ at $x_{N-1} = 6.7684932$.

(3) Then $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)$ is optimal for all
    larger $x_{N-1}$.

FIGURE: $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)$ of example 5.1 is indicated by the solid
overline, where the dot-dash line represents the cost of
driving to $x_N = 1^+$ and the dashed line indicates the unconstrained
solution of (5.9).


From (5.14)-(5.16), we find that the value of the costs at their
intersections is 38.84.
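The location of the crossing can be estimated directly from the quadratic pieces. The sketch below is an illustration under stated assumptions, not the author's procedure: it uses the exact fractions $13/17$ and $2.75$ in place of the rounded coefficients in the text, and the function name `crossing` is hypothetical. It takes the larger root because the smaller one falls below $x_{N-1} = 2.75$, where the piece $x_{N-1}^2 - 2x_{N-1} + 2.75$ does not apply.

```python
import math

def crossing(k_end=13 / 17, g=2.75):
    """Larger root of x**2 - 2*x + g = k_end * x**2.

    This is where the hedging piece x**2 - 2*x + g meets the end piece
    k_end * x**2; by symmetry the mirror-image crossing is at -root.
    """
    a, b, c = 1.0 - k_end, -2.0, g
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

x_cross = crossing()
print(round(x_cross, 2))
```

With exact coefficients the root is close to the joining-point value of about 6.77 used in the text; small differences in later digits are to be expected from rounding of the coefficients.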

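Collecting the above information we have that the optimal
expected cost-to-go from $(x_{N-1}, r_{N-1}=1)$ is

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \begin{cases} .7647058\, x_{N-1}^2 & \text{if } x_{N-1} \le \delta_{N-1}(1) \\ x_{N-1}^2 + 2 x_{N-1} + 2.7499997 & \text{if } \delta_{N-1}(1) \le x_{N-1} \le \delta_{N-1}(2) \\ .6363636\, x_{N-1}^2 & \text{if } \delta_{N-1}(2) \le x_{N-1} \le \delta_{N-1}(3) \\ x_{N-1}^2 - 2 x_{N-1} + 2.7499997 & \text{if } \delta_{N-1}(3) \le x_{N-1} \le \delta_{N-1}(4) \\ .7647058\, x_{N-1}^2 & \text{if } \delta_{N-1}(4) \le x_{N-1} \end{cases} \tag{5.17}$$

The optimal control laws are

$$u_{N-1}(x_{N-1}, r_{N-1}=1) = \begin{cases} -.7647058\, x_{N-1} & \text{if } x_{N-1} \le \delta_{N-1}(1) \\ -x_{N-1} - 1 & \text{if } \delta_{N-1}(1) \le x_{N-1} \le \delta_{N-1}(2) \\ -.6363636\, x_{N-1} & \text{if } \delta_{N-1}(2) \le x_{N-1} \le \delta_{N-1}(3) \\ -x_{N-1} + 1 & \text{if } \delta_{N-1}(3) \le x_{N-1} \le \delta_{N-1}(4) \\ -.7647058\, x_{N-1} & \text{if } \delta_{N-1}(4) \le x_{N-1} \end{cases} \tag{5.18}$$

and the value of $x_N$ obtained by application of the optimal control
law is

$$x_N(x_{N-1}, r_{N-1}=1) = \begin{cases} .2352942\, x_{N-1} & \text{if } x_{N-1} \le \delta_{N-1}(1) \\ -1^+ & \text{if } \delta_{N-1}(1) \le x_{N-1} \le \delta_{N-1}(2) \\ .3636364\, x_{N-1} & \text{if } \delta_{N-1}(2) \le x_{N-1} \le \delta_{N-1}(3) \\ 1^- & \text{if } \delta_{N-1}(3) \le x_{N-1} \le \delta_{N-1}(4) \\ .2352942\, x_{N-1} & \text{if } \delta_{N-1}(4) \le x_{N-1} \end{cases} \tag{5.19}$$

where we denote the "joining points" (where these quantities change) by

$$\delta_{N-1}(1) = -6.77 = -\delta_{N-1}(4), \qquad \delta_{N-1}(2) = -2.75 = -\delta_{N-1}(3)$$

The notation $x_N(x_{N-1}, r_{N-1})$ in (5.19) is used for the optimal
value of $x_N$ that is obtained in form $r_{N-1}$, as a function of $x_{N-1}$.
This notation will be used in the remainder of the thesis. The
optimal expected costs-to-go, control laws and obtained $x_N$ values
are illustrated in figures 5.9, 5.10 and 5.11, respectively.
Figure 5.9 ($V_{N-1}(x_{N-1}, r_{N-1}=1)$) has purposely not been drawn to scale
so that the behavior at the joining points can be clearly seen.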

In light of the solution of this last-stage example problem
we make the following observations and claims:

1. From (5.17)-(5.18) we see that in this example
   the optimal expected cost $V_{N-1}(x_{N-1}, r_{N-1}=1)$ is
   piecewise-quadratic in $x_{N-1}$ and the optimal
   control law is piecewise-linear. When we go back
   another stage in time, the optimal cost
   $V_{N-2}(x_{N-2}, r_{N-2}=1)$ can be obtained using a similar
   approach. Things become more complicated, however.

Specifically, one step further back we will be confronted with another
optimization problem to compute the optimal cost-to-go. Following
the same procedure, we can break the real line into regions in which
the function to be optimized is quadratic. In doing this we must
take into account into what $x_{N-1}$ piece of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ we are
driving the system as well as into what probability piece.

FIGURE 5.11: $x_N$ values obtained from $(x_{N-1}, r_{N-1}=1)$ using the optimal
controls, in example 5.1.

It is intuitively obvious that the optimal expected cost-to-go
$V_k(x_k, r_k=j)$ will be piecewise-quadratic in $x_k$ (and the controls
piecewise linear) at each time stage $k$. The bookkeeping details of
this will be taken care of in Section 5.4.


2. In figure 5.9 we see that at $\delta_{N-1}(2)$ and $\delta_{N-1}(3)$,
   the optimal expected cost has continuous slope. At $\delta_{N-1}(1)$
   and $\delta_{N-1}(4)$, the slope decreases discontinuously.
   This illustrates a general property: at its "joining
   points" $\{\delta_k^j(t): t=1,\dots,m_k(j)-1\}$ the slope of the
   optimal expected cost-to-go $V_k(x_k, r_k=j)$ is continuous
   or it decreases discontinuously (see Proposition 5.1).


3. In figure 5.10 we see that at $\delta_{N-1}(1)$ and $\delta_{N-1}(4)$,
   the optimal controls are discontinuous in $x_{N-1}$.
   However at $\delta_{N-1}(2)$ and $\delta_{N-1}(3)$, the controller is
   continuous (and the optimal cost is differentiable).
   In general, at each of its joining points
   $\delta_k^j(t)$ the optimal control law $u_k(x_k, r_k=j)$ is
   discontinuous if and only if the slope of
   $V_k(x_k, r_k=j)$ decreases there, and $u_k(x_k, r_k=j)$ is
   continuous but not differentiable if and only if
   $V_k(x_k, r_k=j)$ is differentiable there (see Proposition
   5.3 in Section 5.6).


4. Note that for $x_{N-1}$ negative enough, the optimal
   controller (in one time step) does not drive $x$
   into a different probability piece. That is,
   for $x_{N-1} < \delta_{N-1}(1)$, the optimal controller keeps
   $x_N < -1$. Similarly for $x_{N-1}$ large enough, the
   optimal controller keeps $x_N$ in the same probability
   piece; for $x_{N-1} > \delta_{N-1}(4)$, we get $x_N > 1$.

   In general $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ have
   extreme regions of $x_k$ values (left endpieces
   for $x_k < \delta_k^j(1)$ and right endpieces for
   $x_k > \delta_k^j(m_k(j)-1)$) from which the optimal controller
   will never (through the terminal time $N$)
   drive the system into a different piece of the
   form transition probabilities $p_{ji}$ (for any form $i$
   accessible from form $j$). The properties of these
   endpieces will be addressed in detail in Chapter 6.

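5. Let

   $$\hat{V}_{k+1}(x_{k+1} \mid r_k=j) \triangleq E\left\{ x_{k+1}^2 Q(r_{k+1}) + S(r_{k+1}) x_{k+1} + P(r_{k+1}) + V_{k+1}(x_{k+1}, r_{k+1}) \,\middle|\, r_k=j \right\}$$

   denote the conditional expected cost-to-go from
   $(x_{k+1}, r_{k+1})$ given that $r_k=j$. This is a function of $x_{k+1}$.

   For this example, the conditional expected cost
   $\hat{V}_N(x_N \mid r_{N-1}=1)$ is shown in figure 5.12, and is
   given by

   $$\hat{V}_N(x_N \mid r_{N-1}=1) = \begin{cases} \tfrac{7}{4} x_N^2 & \text{if } |x_N| < 1 \\ \tfrac{13}{4} x_N^2 & \text{if } |x_N| > 1 \end{cases}$$

   As we shall see in later sections of this
   chapter, the behavior of this conditional
   expected cost function is intimately related
   to qualitative properties of the optimal
   controller and combinatoric properties of the
   solution.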

One relationship is apparent from example 5.1: active hedging to
a point from $(x_k, r_k=j)$ occurs only to points $x_{k+1}$ where the
conditional expected cost $\hat{V}_{k+1}(x_{k+1} \mid r_k=j)$ is discontinuous; these
points can only arise from form transition probability discontinuities.
This will be proved in section 5.6.

For $x_{N-1}$ values between $\delta_{N-1}(1) = -6.77$ and $\delta_{N-1}(2) = -2.75$,
the optimal strategy is to drive $x_N$ into $(-1,1)$, where the conditional
expected cost-to-go $\hat{V}_N(x_N \mid r_{N-1}=1)$ is lower. Thus we have $x_N = -1^+$ here.
Similarly for $x_{N-1}$ values between $\delta_{N-1}(3)$ and $\delta_{N-1}(4)$ we get $x_N = 1^-$.
In these two regions of $x_{N-1}$ values, the optimal controller actively
hedges to a point. That is, it uses control $u_{N-1}$ to alter the
probability of failure $p(1,2:x_N)$ value. In this example the
system actively hedges only to points $x_N$ that are transition
probability discontinuities.

6. For $\delta_{N-1}(2) \le x_{N-1} \le \delta_{N-1}(3)$, the optimal controller
   doesn't have to actively hedge since the system is
   driven into $(-1,1)$ by (5.10) anyway.

This is true in general for systems with purely quadratic costs
(i.e., $S(i) = P(i) = H_T(i) = G_T(i) = 0$, all $i \in M$). For such systems,
$V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ have middle pieces containing $x=0$,
from which the optimal controller never (through terminal time $N$)
drives the system into a different piece of the form transition
probabilities $p_{ji}$ (for any form $i$ accessible from $j$). The existence and
properties of middle pieces will be addressed in chapter 6.


7. In figure 5.11 we can see that certain $x_N$
   values are never obtained by the optimal
   controller. In particular, we have

   $$x_N \notin (-1.589, -1), \qquad x_N \notin (1, 1.589).$$

These regions of $x_N$ avoidance are state values that the system must
avoid if it is to be optimally controlled. Note that these regions
of $x_N$ avoidance correspond to $x_{N-1}$ values where $V_{N-1}(x_{N-1}, r_{N-1}=1)$
is not differentiable (i.e., $\delta_{N-1}(1)$ and $\delta_{N-1}(4)$).

In general, there is a region of $x_{k+1}$ values that is avoided
from $(x_k, r_k=j)$ corresponding to each nondifferentiable point of the
optimal expected cost $V_k(x_k, r_k=j)$. This is shown in Proposition 5.3
(in section 5.6).

In  the  next  section  we  will  develop  a  procedure  for  the  solution 
of  one  stage  of  the  general  problem  formulation  of  section  5.2, 
and  we  will  verify  the  first  two  of  the  seven  claims  above. 




5.4  One  Stage  of  the  General  Problem 


In this section we use intuition gained from the example problem
of the last section to solve the optimal control problem of section 5.2
for one time stage. As we indicated earlier, the notation and
"bookkeeping" become quite complex, but the basic idea is the same as
illustrated in the previous section. Inductive application of the
one-stage solution (backwards in time from finite terminal time $N$) then
establishes that the solution of problem (5.1)-(5.6) yields optimal
expected costs-to-go that are piecewise-quadratic in $x_k$ and optimal
control laws that are piecewise-linear, for all forms $j \in M$:

$$V_k(x_k, r_k=j) = x_k^2 K_k(t:j) + x_k H_k(t:j) + G_k(t:j) \tag{5.20}$$

$$u_k(x_k, r_k=j) = -L_k(t:j)\, x_k + F_k(t:j) \tag{5.21}$$

when

$$\delta_k^j(t-1) < x_k < \delta_k^j(t), \tag{5.22}$$

where

$$\{\delta_k^j(1) < \delta_k^j(2) < \dots < \delta_k^j(m_k(j)-1)\}$$

are the points where the pieces of $V_k(x_k, r_k=j)$ are joined together
(the boundaries of the $x_k$-intervals) and

$$\delta_k^j(0) = -\infty, \qquad \delta_k^j(m_k(j)) = \infty.$$

The proof of the one-stage optimal controller result is constructive.
It suggests an algorithm for the recursive determination
of the optimal expected costs-to-go and control laws for this problem.
The one-stage solution result is as follows:


Proposition 5.1: (One stage solution)

Consider the problem of section 5.2. If at time $k+1$, for each
$r_{k+1} = j \in M$ we have

(i) $V_{k+1}(x_{k+1}, r_{k+1}=j)$ is piecewise-quadratic with
    $m_{k+1}(j)$ pieces joined continuously at
    $\delta_{k+1}^j(1) < \delta_{k+1}^j(2) < \dots < \delta_{k+1}^j(m_{k+1}(j)-1)$,

(ii) $\begin{bmatrix} K_{k+1}(t:j) & H_{k+1}(t:j)/2 \\ H_{k+1}(t:j)/2 & G_{k+1}(t:j) \end{bmatrix} \ge 0$
     for $t = 1, 2, \dots, m_{k+1}(j)$,

(iii) $\partial V_{k+1}(x_{k+1}, r_{k+1}=j) / \partial x_{k+1}$ is continuous or decreases
      discontinuously at the joining points
      $\{\delta_{k+1}^j(1), \dots, \delta_{k+1}^j(m_{k+1}(j)-1)\}$,

then for each $r_k = j \in M$

(1) $V_k(x_k, r_k=j)$ is piecewise-quadratic and $u_k(x_k, r_k=j)$ is
    piecewise-linear (as in (5.20)-(5.22)), each having
    $m_k(j)$ pieces joined continuously at
    $\delta_k^j(1) < \delta_k^j(2) < \dots < \delta_k^j(m_k(j)-1)$,

(2) $\begin{bmatrix} K_k(t:j) & H_k(t:j)/2 \\ H_k(t:j)/2 & G_k(t:j) \end{bmatrix} \ge 0$
    for $t = 1, 2, \dots, m_k(j)$,

(3) $\partial V_k(x_k, r_k=j) / \partial x_k$ is continuous or decreases
    discontinuously at the joining points
    $\{\delta_k^j(1), \dots, \delta_k^j(m_k(j)-1)\}$.

At time $k=N$, conditions (i)-(iii) are clearly satisfied. Thus this
proposition can be applied inductively, backwards in time from $k=N$.
Equations for the iterative computation of the quantities $m_k(j)$,
$K_k(t:j)$, $H_k(t:j)$, $G_k(t:j)$ and $\{\delta_k^j(\ell): \ell=1,\dots,m_k(j)-1\}$ for each
$j \in M$ are listed in appendix C.1. These equations are developed
in the proof of Proposition 5.1, which constitutes the remainder of
this section (with some details in appendix C.1).


Proof of proposition 5.1:

For each form $r_k = j \in M$, the minimization in (5.5) subject to
(5.1)-(5.2) is converted into the comparison of a finite set of
constrained-in-$x_{k+1}$ JLQ problems, each with x-independent forms.
This is done conceptually via the following four steps:

Step 1: Obtaining a composite partition of $x_{k+1}$ values from the
        partitions associated with the form transition
        probabilities $p(j,i;x)$ and the expected costs-to-go
        $V_{k+1}(x_{k+1}, r_{k+1}=i)$ for each $i \in C_j$. Note that
        the partitions are of $x_{k+1}$ values for each
        different form at time $k$ (not at $k+1$).

Step 2: Formulating a set of constrained (in $x_{k+1}$) JLQ problems
        having x-independent form transition probabilities and
        quadratic costs; one problem for each region of $x_{k+1}$
        values in the composite partition of Step 1.

Step 3: Solving the constrained subproblems that are formulated
        in Step 2. These problem solutions represent the
        optimal expected costs-to-go from $(x_k, r_k=j)$ if
        $x_{k+1}$ is constrained to be in one of the specific
        regions of $x_{k+1}$ values defined in Step 1.

Step 4: Comparing the constrained costs. The optimal expected
        cost-to-go $V_k(x_k, r_k=j)$ from any $x_k$ value is the
        minimum of the constrained expected costs-to-go that
        are obtained in Step 3. This minimization involves the
        comparison of piecewise-quadratic functions in $x_k$.

We will describe each of these conceptual steps in sequence so as to
demonstrate the validity of Proposition 5.1. The actual solution
algorithm (as described in chapter 7) mixes these steps and uses the
combinatoric results of section 5.6 to solve the control problem
efficiently (i.e., with fewer calculations).

Step 1: Obtaining the Composite Partitions of $x_{k+1}$ Values

For each form $j \in M$ we construct a composite partition of the real line
(i.e. for $x_{k+1}$ values) by superimposing the grids associated
with each $p(j,i;x)$ and $V_{k+1}(x_{k+1}, r_{k+1}=i)$, for all $i \in C_j$.
The construction of these composite grids is first
illustrated for an example system below. The general
procedure is then specified.


Example 5.2:

Consider a system whose form structure is as shown in figure 5.13.
This system might represent the following situation:

    r = 1        normal operation
    r = 2        degraded operation (repairable failure)
    r = 3        nonrepairable failure
    p(1,2:x)     one-step probability of repairable
                 failure occurrence (x-dependent)
    p(2,1)       one-step probability of repair
    p(1,3:x)     one-step probability of nonrepairable
                 system failure occurrence.

The form transition probabilities from $r_k=1$ are piecewise constant in $x$
(but $p(2,1)$, $p(2,2)$ and $p(3,3)$ are x-independent).

FIGURE 5.13: Form structure for example 5.2. Form 3 is an
absorbing form. Here $C_1=\{1,2,3\}$, $C_2=\{1,2\}$,
$C_3=\{3\}$.

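$$\bigl( p(1,1:x) \;\; p(1,2:x) \;\; p(1,3:x) \bigr) = \begin{cases} (.89 \;\; .1 \;\; .01) & \text{if } |x| < 1 \\ (.7 \;\; .2 \;\; .1) & \text{if } 1 < |x| < 2 \\ (0 \;\; .2 \;\; .8) & \text{if } |x| > 2 \end{cases}$$

$$p(2,1) = p(2,2) = .5$$

Thus the numbers of pieces in each of the form transition probabilities
are

$$\nu_{11} = \nu_{13} = 5$$

$$\nu_{12} = 3$$

$$\nu_{21} = \nu_{22} = \nu_{23} = \nu_{31} = \nu_{32} = \nu_{33} = 1.$$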

The x-dependent transition probabilities are shown in figure 5.14.

Suppose that at time $k+1$, the number of pieces in each expected
cost-to-go $V_{k+1}(x_{k+1}, r_{k+1}=i)$ is

$$m_{k+1}(1) = 5, \qquad m_{k+1}(2) = 5, \qquad m_{k+1}(3) = 1$$

as illustrated in figure 5.15. Superimposing the appropriate
partitions for each form $r_k=j \in M$, we obtain the composite partitions
of $x_{k+1}$ values shown in figure 5.16. The number of pieces in each
partition is denoted by $\psi_{k+1}^j$.

FIGURE 5.16: Composite $x_{k+1}$ partitions for example 5.2; (a) for $r_k=1$
the partition has $\psi_{k+1}^1 = 9$ pieces, (b) for $r_k=2$ the
partition has $\psi_{k+1}^2$ pieces, (c) for $r_k=3$ the partition
has $\psi_{k+1}^3 = 1$ pieces.

The $x_{k+1}$ intervals are denoted by

$$\Delta_{k+1}^j(t), \qquad t = 1, \dots, \psi_{k+1}^j$$

with boundary values (grid points) $\gamma_{k+1}^j(t)$.

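The general procedure for obtaining the composite partitions is
as follows:

For each $r_k=j \in M$ the real line can be divided into a finite
number of intervals of $x_{k+1}$ values by superimposing the grids

$$\{\delta_{k+1}^i(t): t = 1, \dots, m_{k+1}(i)-1\}$$

$$\{\nu_{ji}(t): t = 1, \dots, \nu_{ji}-1\}$$

for each $i \in C_j$, obtaining the composite partition

$$-\infty \triangleq \gamma_{k+1}^j(0) < \gamma_{k+1}^j(1) < \dots < \gamma_{k+1}^j(\psi_{k+1}^j) \triangleq \infty$$

of unique grid points. As in example 5.2, we define

    $\psi_{k+1}^j$ = the (finite) number of such nonempty
    $x_{k+1}$ intervals

where the $t$-th such interval is

$$\Delta_{k+1}^j(t) \triangleq \{x_{k+1}: \gamma_{k+1}^j(t-1) < x_{k+1} < \gamma_{k+1}^j(t)\}$$

Note that

$$\psi_{k+1}^j = 1 + \left| \bigcup_{i \in C_j} \Bigl[ \{\nu_{ji}(\ell): \ell = 1, \dots, \nu_{ji}-1\} \cup \{\delta_{k+1}^i(\ell): \ell = 1, \dots, m_{k+1}(i)-1\} \Bigr] \right| \tag{5.23}$$

where $|A|$ denotes the cardinality of a set $A$ (the number of elements).
An upper bound on $\psi_{k+1}^j$ is given by

$$\psi_{k+1}^j \le 1 + \sum_{i \in C_j} \left[ \nu_{ji} + m_{k+1}(i) - 2 \right] \tag{5.24}$$

where the equality in (5.24) holds if

$$\nu_{ji}(\ell) \ne \nu_{jn}(p) \ne \delta_{k+1}^i(t)$$

for all $i, n \in C_j$, $\ell = 1, 2, \dots, \nu_{ji}-1$, $t = 1, 2, \dots, m_{k+1}(i)-1$ and $p = 1, \dots, \nu_{jn}-1$.

Note that in example 5.2, the bounds of (5.24) are not tight because
of the overlapping values: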

*iUi 1 1  + 


l 

iec 


-3  -  -  sha) 


3  -  <wi3)  -  Ci(4> 


■2  =  V1]L(1)  -  V13(l) 


2  =  V  (4)  =  V13(4) 


-1  -  V-  i  (2)=v  (1)=V  ,  (2)**6^  (2)  l=vi  -i  (3)  =v.  , (2)  =V  (3)  =6 
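The superposition of grids in Step 1 amounts to taking the sorted union of all grid points. The sketch below illustrates this with made-up grids; the numerical values are hypothetical and are not those of example 5.2, and the helper names `composite_partition`, `piece_count` and `piece_bound` are assumptions introduced only for illustration.

```python
def composite_partition(grids):
    """Merge several lists of grid points into one sorted list of unique points."""
    return sorted(set().union(*grids))

def piece_count(grids):
    """Number of intervals: one more than the number of unique grid points,
    in the spirit of (5.23)."""
    return 1 + len(composite_partition(grids))

def piece_bound(grids):
    """Upper bound in the spirit of (5.24): every grid point counted separately."""
    return 1 + sum(len(g) for g in grids)

# Hypothetical grids (illustrative values only):
prob_grid = [-2.0, -1.0, 1.0, 2.0]     # discontinuities of a transition probability
cost_grid_a = [-3.0, -1.0, 1.0, 3.0]   # joining points of one cost-to-go
cost_grid_b = [-3.0, -0.5, 0.5, 3.0]   # joining points of another cost-to-go
grids = [prob_grid, cost_grid_a, cost_grid_b]

print(composite_partition(grids), piece_count(grids), piece_bound(grids))
```

Because several grid points coincide in this made-up example, the actual number of pieces falls below the bound, which mirrors the remark about overlapping values in the text.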


Step 2: Formulating the Constrained Subproblems

In Step 1 we obtained for each $r_k=j \in M$ a composite partition
of $x_{k+1}$ values into $\psi_{k+1}^j$ intervals. We can formulate
for each $r_k=j \in M$ a set of $\psi_{k+1}^j$ constrained JLQ problems
having $x_{k+1}$-independent form transition probabilities
and quadratic (not piecewise-quadratic) expected costs,
one corresponding to each region of $x_{k+1}$ values. To see
this note that over each such region $\Delta_{k+1}^j(t)$,
$V_{k+1}(x_{k+1}, r_{k+1}=i)$ is quadratic and
$p(j,i;x_{k+1})$ is constant in $x_{k+1}$, for all $i \in C_j$. These
constrained problems are

$$V_k(x_k, r_k=j \mid x_{k+1} \in \Delta_{k+1}^j(t)) \triangleq \min_{u_k\ \text{s.t.}\ x_{k+1} \in \Delta_{k+1}^j(t)} \left\{ u_k^2 R(j) + E\left[ x_{k+1}^2 Q(r_{k+1}) + S(r_{k+1}) x_{k+1} + P(r_{k+1}) + V_{k+1}(x_{k+1}, r_{k+1}) \,\middle|\, r_k=j \right] \right\} \tag{5.26}$$

$$= \min_{u_k\ \text{s.t.}\ x_{k+1} \in \Delta_{k+1}^j(t)} \left\{ u_k^2 R(j) + \sum_{i=1}^{M} p(j,i;x_{k+1}) \left[ x_{k+1}^2 Q(i) + x_{k+1} S(i) + P(i) + V_{k+1}(x_{k+1}, i) \right] \right\}$$

$$= \min_{u_k\ \text{s.t.}\ x_{k+1} \in \Delta_{k+1}^j(t)} \left\{ u_k^2 R(j) + \hat{V}_{k+1}(x_{k+1} \mid r_k=j) \right\}$$

subject to (5.1)-(5.3) for each $t = 1, 2, \dots, \psi_{k+1}^j$.

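Step 3: Solving the Constrained Subproblems

The third step in this constructive proof of Proposition
5.1 is to solve the constrained JLQ problems of (5.26).
As in example 5.1, for each $r_k=j \in M$ the solutions
of these $\psi_{k+1}^j$ constrained optimization problems
involve optimal expected costs-to-go that are
piecewise-quadratic in $x_k$ with three parts (except for $t=1$ and
$t = \psi_{k+1}^j$, which have only two parts):

$$V_k(x_k, r_k=j \mid x_{k+1} \in \Delta_{k+1}^j(t)) = \begin{cases} V_k^{t,L}(x_k, j) & \text{if } a(j) x_k < \theta_k^j(t) \\ V_k^{t,U}(x_k, j) & \text{if } \theta_k^j(t) \le a(j) x_k \le \Theta_k^j(t) \\ V_k^{t,R}(x_k, j) & \text{if } \Theta_k^j(t) < a(j) x_k \end{cases} \tag{5.29}$$

with corresponding optimal control laws

$$u_k(x_k, r_k=j \mid x_{k+1} \in \Delta_{k+1}^j(t)) = \begin{cases} u_k^{t,L}(x_k, j) & \text{if } a(j) x_k < \theta_k^j(t) \\ u_k^{t,U}(x_k, j) & \text{if } \theta_k^j(t) \le a(j) x_k \le \Theta_k^j(t) \\ u_k^{t,R}(x_k, j) & \text{if } \Theta_k^j(t) < a(j) x_k \end{cases} \tag{5.30}$$

The derivation of expressions for these control law and expected cost
pieces involves straightforward (but tedious) algebraic manipulations
that are described in Appendix C.2. Formulae for the quantities in
(5.29)-(5.30) are listed for reference in Appendix C.1.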
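To make the three-part structure concrete, the sketch below works out one constrained scalar subproblem from first principles. It is an illustration under stated assumptions and does not reproduce the formulae of Appendix C.1: the stage cost is taken to be $R u^2 + q y^2 + s y + p$ with $y = a x + b u$ constrained to $[lo, hi]$, the threshold expressions are derived here by setting the unconstrained minimizer equal to each boundary, and the names `thresholds` and `constrained_piece` are hypothetical.

```python
def thresholds(q, s, lo, hi, a=1.0, b=1.0, R=1.0):
    """Return (theta, Theta) such that the unconstrained optimum stays inside
    (lo, hi) exactly when theta < a*x < Theta.

    Assumed cost: R*u**2 + q*y**2 + s*y + p with y = a*x + b*u.
    """
    def edge(y):
        # value of a*x at which the unconstrained minimizer equals y
        return (y * (R + q * b * b) + 0.5 * s * b * b) / R
    return edge(lo), edge(hi)

def constrained_piece(x, q, s, p, lo, hi, a=1.0, b=1.0, R=1.0):
    """Label ('L', 'U' or 'R') and cost of the piece that applies at x."""
    theta, Theta = thresholds(q, s, lo, hi, a, b, R)
    if a * x < theta:
        label, y = "L", lo       # driven to the left boundary
    elif a * x > Theta:
        label, y = "R", hi       # driven to the right boundary
    else:
        label, y = "U", (R * a * x - 0.5 * s * b * b) / (R + q * b * b)
    u = (y - a * x) / b
    return label, R * u * u + q * y * y + s * y + p

print(thresholds(7 / 4, 0.0, -1.0, 1.0))
print(constrained_piece(3.0, 7 / 4, 0.0, 0.0, -1.0, 1.0))
```

For the middle region of example 5.1 ($q = 7/4$, $s = p = 0$, constraint $[-1, 1]$) the thresholds come out at $\pm 2.75$, matching the range found in section 5.3.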

As in example 5.1, one piece of each expected cost-to-go in (5.29)
and control law in (5.30), denoted by $V_k^{t,U}(x_k, j)$ and $u_k^{t,U}(x_k, j)$,
corresponds to performing the minimization in (5.26) without the
$x_{k+1} \in \Delta_{k+1}^j(t)$ constraint. The functions $V_k^{t,U}(x_k, j)$ and $u_k^{t,U}(x_k, j)$ solve
(5.26) only for those $x_k$ values for which the constraint $x_{k+1} \in \Delta_{k+1}^j(t)$
is inactive, that is, where application of the control that minimizes
(5.26) results in an $x_{k+1}$ value in the interior of $\Delta_{k+1}^j(t)$. We define
$\theta_k^j(t)$ and $\Theta_k^j(t)$ so that these $x_k$ values (where $V_k^{t,U}(x_k, j)$ applies) satisfy

$$\theta_k^j(t) < a(j) x_k < \Theta_k^j(t).$$

For $t = 2, 3, \dots, \psi_{k+1}^j$ another piece of $V_k(x_k, j \mid t)$ in (5.26), denoted
by $V_k^{t,L}(x_k, j)$, corresponds to driving $x_{k+1}$ to the left boundary of
constraint region $\Delta_{k+1}^j(t)$. That is, $x_{k+1} = [\gamma_{k+1}^j(t-1)]^+$. The functions
$V_k^{t,L}(x_k, j)$ and $u_k^{t,L}(x_k, j)$ solve (5.26) for those $x_k$ values where the
constraint $x_{k+1} \in \Delta_{k+1}^j(t)$ is active, and where $u_k^{t,U}(x_k, j)$ results in
$x_{k+1} < \gamma_{k+1}^j(t-1)$. That is, $V_k^{t,L}(x_k, j)$ and $u_k^{t,L}(x_k, j)$ solve (5.26) for
$a(j) x_k < \theta_k^j(t)$.

For $t = 1, 2, \dots, \psi_{k+1}^j - 1$ another piece of $V_k(x_k, j \mid t)$ in (5.26),
denoted by $V_k^{t,R}(x_k, j)$, corresponds to driving $x_{k+1}$ to the right
boundary of constraint region $\Delta_{k+1}^j(t)$. That is, $x_{k+1} = [\gamma_{k+1}^j(t)]^-$.
The functions $V_k^{t,R}(x_k, j)$ and $u_k^{t,R}(x_k, j)$ solve (5.26) for those $x_k$
values where the constraint $x_{k+1} \in \Delta_{k+1}^j(t)$ is active, and where $u_k^{t,U}(x_k, j)$
results in $x_{k+1} > \gamma_{k+1}^j(t)$. That is, $V_k^{t,R}(x_k, j)$ and $u_k^{t,R}(x_k, j)$
solve (5.26) for

$$a(j) x_k > \Theta_k^j(t).$$

For $t = \psi_{k+1}^j$ there is no finite right boundary of $\Delta_{k+1}^j(\psi_{k+1}^j)$ (i.e.,
$\gamma_{k+1}^j(\psi_{k+1}^j) \triangleq \infty$), so there is no $V_k^{\psi_{k+1}^j,R}(x_k, j)$ piece; thus
$\Theta_k^j(\psi_{k+1}^j) = +\infty$ in (5.29)-(5.30). We summarize the solution to (5.26)
in table 5.2.


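For $t = 2, 3, \dots, \psi_{k+1}^j - 1$ a typical three-part
$V_k(x_k, r_k=j \mid x_{k+1} \in \Delta_{k+1}^j(t))$ looks like either (a), (b) or (c) of figure
5.17. Here the quadratic (in $x_k$) function

    $V_k^{t,L}(x_k, j)$ is denoted by the dotted line
    $V_k^{t,U}(x_k, j)$ is denoted by the dashed line
    $V_k^{t,R}(x_k, j)$ is denoted by the dot-dash line

The solid line in each figure indicates which of these three cost
functions applies over various $x_k$ values.

The three different possible shapes of $V_k(x_k, j \mid t)$ shown in
figure 5.17 arise from the relative values of the minimal points
of $V_k^{t,L}(x_k, j)$, $V_k^{t,U}(x_k, j)$ and $V_k^{t,R}(x_k, j)$. At $x_k = \theta_k^j(t)/a(j)$
the values and slopes of $V_k^{t,L}(x_k, j)$ and $V_k^{t,U}(x_k, j)$ are the same.
At $x_k = \Theta_k^j(t)/a(j)$, the values and slopes of $V_k^{t,R}(x_k, j)$ and
$V_k^{t,U}(x_k, j)$ are the same. At all other $x_k$ values, the constrained costs
are greater than the unconstrained costs. That is, for $t = 2, \dots, \psi_{k+1}^j$:

$$V_k^{t,L}(x_k, j) > V_k^{t,U}(x_k, j) \qquad x_k \ne \theta_k^j(t)/a(j)$$

$$V_k^{t,L}(x_k, j) \Big|_{x_k = \theta_k^j(t)/a(j)} = V_k^{t,U}(x_k, j) \Big|_{x_k = \theta_k^j(t)/a(j)} \tag{5.31}$$

$$\frac{\partial V_k^{t,L}(x_k, j)}{\partial x_k} \bigg|_{x_k = \theta_k^j(t)/a(j)} = \frac{\partial V_k^{t,U}(x_k, j)}{\partial x_k} \bigg|_{x_k = \theta_k^j(t)/a(j)} \tag{5.32}$$

and for $t = 1, \dots, \psi_{k+1}^j - 1$:

$$V_k^{t,R}(x_k, j) > V_k^{t,U}(x_k, j) \qquad x_k \ne \Theta_k^j(t)/a(j)$$

$$V_k^{t,R}(x_k, j) \Big|_{x_k = \Theta_k^j(t)/a(j)} = V_k^{t,U}(x_k, j) \Big|_{x_k = \Theta_k^j(t)/a(j)} \tag{5.33}$$

$$\frac{\partial V_k^{t,R}(x_k, j)}{\partial x_k} \bigg|_{x_k = \Theta_k^j(t)/a(j)} = \frac{\partial V_k^{t,U}(x_k, j)}{\partial x_k} \bigg|_{x_k = \Theta_k^j(t)/a(j)} \tag{5.34}$$

As is evident in figure 5.17, since $V_k^{t,L}(x_k, j)$ and $V_k^{t,R}(x_k, j)$ have
the same curvature it follows that:

$$V_k^{t,R}(x_k, j) < V_k^{t,L}(x_k, j) \quad \text{if } a(j) x_k > \Theta_k^j(t) \tag{5.35}$$

$$V_k^{t,L}(x_k, j) < V_k^{t,R}(x_k, j) \quad \text{if } a(j) x_k < \theta_k^j(t) \tag{5.36}$$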


Step 4: Comparing the Constrained Costs

The fourth step in this proof of Proposition 5.1 is to
compare the solutions of the constrained JLQ problems
specified by (5.26). For each $r_k = j \in M$, $V_k(x_k, r_k=j)$
at each $x_k$ value is the smallest of the constrained
costs in (5.29). That is,

$$V_k(x_k, r_k=j) = \min_{t = 1, \dots, \psi_{k+1}^j} V_k(x_k, r_k=j \mid x_{k+1} \in \Delta_{k+1}^j(t)) \tag{5.37}$$

This minimization involves the comparison of piecewise-quadratic
functions in $x_k$.

In principle we can use (5.37) to find $V_k(x_k, r_k=j)$ and
$u_k(x_k, r_k=j)$ (that is, the quantities $K_k(t:j)$, $H_k(t:j)$, $G_k(t:j)$,
$L_k(t:j)$, $F_k(t:j)$, $\{\delta_k^j(t): t=1,\dots,m_k(j)-1\}$ and $m_k(j)$ as in (5.20)-
(5.22)). This minimization was done graphically for example 5.1.
In general, we must accomplish the minimization of (5.37) by finding
the intersections of the $(3\psi_{k+1}^j - 2)$ quadratic functions

$$V_k^{1,U}(x_k, j),\; V_k^{1,R}(x_k, j),\; V_k^{2,L}(x_k, j),\; V_k^{2,U}(x_k, j),\; V_k^{2,R}(x_k, j),\; \dots,$$

$$V_k^{\psi_{k+1}^j - 1,L}(x_k, j),\; V_k^{\psi_{k+1}^j - 1,U}(x_k, j),\; V_k^{\psi_{k+1}^j - 1,R}(x_k, j),\; V_k^{\psi_{k+1}^j,L}(x_k, j),\; V_k^{\psi_{k+1}^j,U}(x_k, j) \tag{5.38}$$

and choosing $V_k(x_k, r_k=j)$ at each value of $x_k$ to be the one having the
lowest value there (for those costs that are valid at $x_k$). Thus
$V_k(x_k, r_k=j)$ is piecewise-quadratic in $x_k$ and $u_k(x_k, r_k=j)$ is
piecewise-linear, as claimed in (1) of Proposition 5.1. The verification of (2)
in the proposition is straightforward.

The fact that $\partial V_k(x_k, r_k=j) / \partial x_k$ is continuous or decreases
discontinuously at the joining points $\{\delta_k^j(1), \dots, \delta_k^j(m_k(j)-1)\}$ follows
directly from the comparison in (5.37): a particular joining point
$\delta_k^j(\ell)$ can arise in two ways:

(1) two (or more) of the constrained costs-to-go in
    (5.37) may cross at $\delta_k^j(\ell)$. Since $V_k(x_k, r_k=j)$
    is the smallest candidate cost at each $x_k$ value,
    the slope of $V_k(x_k, r_k=j)$ must decrease
    discontinuously at such a $\delta_k^j(\ell)$. This is illustrated in figure 5.18.

(2) $\delta_k^j(\ell)$ may be an $x_k$ value where the optimal candidate
    cost in (5.37) changes from a constrained piece to
    an unconstrained piece (or vice versa). That is,
    $\delta_k^j(\ell)$ corresponds to either

    - $x_k = \theta_k^j(t)/a(j)$, where $V_k(x_k, j \mid t)$ changes
      from $V_k^{t,L}(x_k, j)$ to $V_k^{t,U}(x_k, j)$ and $\partial V_k(x_k, j \mid t)/\partial x_k$ is continuous, or

    - $x_k = \Theta_k^j(t)/a(j)$, where $V_k(x_k, j \mid t)$ changes from $V_k^{t,U}(x_k, j)$
      to $V_k^{t,R}(x_k, j)$ and $\partial V_k(x_k, j \mid t)/\partial x_k$ is continuous.


This concludes the proof of the one-stage solution given by
Proposition 5.1. Certain qualitative properties of the optimal
controller that are developed later in this chapter and in chapters 6 and 7
will be used to simplify the procedure that is described above. The
actual solution algorithm is presented in chapter 7. In the next section
we will demonstrate the application of steps 1-4 in the next stage
(k=N-2) of example 5.1.


since  we  only  consider  r=l. 




From (5.17)-(5.18), the solution at stage k=N-1 is

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = x_{N-1}^2\,K_{N-1}(t:1) + x_{N-1}\,H_{N-1}(t:1) + G_{N-1}(t:1)$$

$$u_{N-1}(x_{N-1}, r_{N-1}=1) = -L_{N-1}(t:1)\,x_{N-1} + F_{N-1}(t:1)$$

for $\delta_{N-1}(t-1) \le x_{N-1} \le \delta_{N-1}(t)$

where

$m_{N-1}(1) = 5$

$\delta_{N-1}(0) = -\infty$, $\delta_{N-1}(1) = -6.77$, $\delta_{N-1}(2) = -2.75$, $\delta_{N-1}(3) = 2.75$, $\delta_{N-1}(4) = 6.77$, $\delta_{N-1}(5) = \infty$

$K_{N-1}(1:1) = .7647 = K_{N-1}(5:1)$
$K_{N-1}(2:1) = 1 = K_{N-1}(4:1)$
$K_{N-1}(3:1) = .6364$

$H_{N-1}(2:1) = 2 = -H_{N-1}(4:1)$
$H_{N-1}(1:1) = H_{N-1}(3:1) = H_{N-1}(5:1) = 0$

$G_{N-1}(2:1) = 2.75 = G_{N-1}(4:1)$
$G_{N-1}(1:1) = G_{N-1}(3:1) = G_{N-1}(5:1) = 0$

as shown in figures 5.9, 5.10.

Now we proceed to the next stage, following the four solution steps
of section 5.4.


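Step 1:

Given $\{\delta_{N-1}(1), \delta_{N-1}(2), \delta_{N-1}(3), \delta_{N-1}(4)\}$ from section 5.2 and
the form transition probability discontinuities

$\nu_{12}(1) = -1$, $\nu_{12}(2) = +1$

we can obtain the composite partition of $x_{N-1}$. From (C.1.1)-(C.1.3)
and (C.1.6) we can compute the conditional expected cost-to-go
$\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$ as well. We find that $\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$ has
$\psi_{N-1} = 7$ pieces with boundaries

$\gamma_{N-1}(1) = \delta_{N-1}(1) = -6.77$
$\gamma_{N-1}(2) = \delta_{N-1}(2) = -2.75$
$\gamma_{N-1}(3) = \nu_{12}(1) = -1$
$\gamma_{N-1}(4) = \nu_{12}(2) = 1$
$\gamma_{N-1}(5) = \delta_{N-1}(3) = 2.75$
$\gamma_{N-1}(6) = \delta_{N-1}(4) = 6.77$

and that

$$\hat V_{N-1}(x_{N-1}|r_{N-2}=1) = x_{N-1}^2\,\hat K_{N-1}(t) + x_{N-1}\,\hat H_{N-1}(t) + \hat G_{N-1}(t) \quad \text{if } x_{N-1} \in \Delta_{N-1}(t)$$

for t=1,...,7, with

$\hat K_{N-1}(1) = \hat K_{N-1}(7) = 3.591$
$\hat K_{N-1}(2) = \hat K_{N-1}(6) = 3.65$
$\hat H_{N-1}(2) = .5$
$\hat H_{N-1}(6) = -.5$
$\hat G_{N-1}(2) = \hat G_{N-1}(6) = .6875$
$\hat K_{N-1}(3) = \hat K_{N-1}(5) = 3.559$
$\hat K_{N-1}(4) = 2.277$

$\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$ is shown in figure 5.19.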
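The construction of the composite partition in Step 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `composite_partition` and the variable names are hypothetical, and the numerical values are the joining points and discontinuities quoted above for example 5.1.

```python
import math

def composite_partition(deltas, nus):
    """Merge cost joining points (deltas) and transition-probability
    discontinuities (nus) into one sorted list of piece boundaries.
    Coincident values collapse into a single boundary."""
    return sorted(set(deltas) | set(nus))

# Values quoted in the text for example 5.1 (stage N-1, form 1).
deltas_N1 = [-6.77, -2.75, 2.75, 6.77]   # joining points of V_{N-1}(x, r=1)
nus_12 = [-1.0, 1.0]                     # discontinuities of p(1,2;x)

gammas = composite_partition(deltas_N1, nus_12)
psi = len(gammas) + 1                    # number of pieces of the partition

# Each piece as a (lower, upper) interval, with infinite end pieces.
edges = [-math.inf] + gammas + [math.inf]
pieces = list(zip(edges[:-1], edges[1:]))

print(gammas)
print(psi)
```

With the values of the example this gives six boundaries and seven pieces, in agreement with the count stated in Step 1.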


<o 


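Step 2: The $\psi_{N-1} = 7$ constrained JLQ problems, as in (5.28), are
thus

$$V_{N-2}(x_{N-2}, r_{N-2}=1|1) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(1)} \{u_{N-2}^2 + (3.5911765)\,x_{N-1}^2\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|2) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(2)} \{u_{N-2}^2 + 3.65\,x_{N-1}^2 + \tfrac{1}{2}x_{N-1} + .6875\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|3) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(3)} \{u_{N-2}^2 + (3.5590909)\,x_{N-1}^2\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|4) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(4)} \{u_{N-2}^2 + 2.2772727\,x_{N-1}^2\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|5) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(5)} \{u_{N-2}^2 + (3.5590909)\,x_{N-1}^2\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|6) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(6)} \{u_{N-2}^2 + 3.65\,x_{N-1}^2 - \tfrac{1}{2}x_{N-1} + .6874999\}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1|7) = \min_{u_{N-2}\ \mathrm{s.t.}\ x_{N-1}\in\Delta_{N-1}(7)} \{u_{N-2}^2 + (3.5911765)\,x_{N-1}^2\}$$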


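Step 3: Solving these constrained problems (using the formulae of
appendix C.1) we find that

$$V_{N-2}(x_{N-2},1|1) = \begin{cases} V^{1,U}_{N-2} = .78219\,x_{N-2}^2 & \text{if } x_{N-2} \le -30.975976 \\ V^{1,R}_{N-2} = x_{N-2}^2 + 13.537586\,x_{N-2} + 210.33327 & \text{if } x_{N-2} \ge -30.975976 \end{cases}$$

$$V_{N-2}(x_{N-2},1|2) = \begin{cases} V^{2,L}_{N-2} = x_{N-2}^2 + 13.537586\,x_{N-2} + 210.33327 & \text{if } x_{N-2} \le -31.223493 \\ V^{2,U}_{N-2} = .7849462\,x_{N-2}^2 + .1075268\,x_{N-2} + .674059 & \text{if } -31.223493 < x_{N-2} < -12.53749 \\ V^{2,R}_{N-2} = x_{N-2}^2 + 5.4999994\,x_{N-2} + 34.478117 & \text{if } -12.537499 \le x_{N-2} \end{cases}$$

$$\vdots$$

$$V_{N-2}(x_{N-2},1|7) = \begin{cases} V^{7,L}_{N-2} = x_{N-2}^2 - 13.537586\,x_{N-2} + 210.33327 & \text{if } x_{N-2} \le 30.975976 \\ V^{7,U}_{N-2} = .7821909\,x_{N-2}^2 & \text{if } x_{N-2} \ge 30.975976 \end{cases}$$

That is,

$\theta_{N-2}(2) = -31.22 = -\Theta_{N-2}(6)$
$\Theta_{N-2}(1) = -30.98 = -\theta_{N-2}(7)$
$\Theta_{N-2}(2) = \theta_{N-2}(3) = -12.54 = -\Theta_{N-2}(5) = -\theta_{N-2}(6)$
$\Theta_{N-2}(3) = -4.559 = -\theta_{N-2}(5)$
$\theta_{N-2}(4) = -3.277 = -\Theta_{N-2}(4)$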
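The piece-2 results of Step 3 can be reproduced numerically. The sketch below is not the formulae of appendix C.1 (which are not reproduced in this section); it simply minimizes $R u^2 + \hat K y^2 + \hat H y + \hat G$ over $u$ with $y = a x + b u$, and it assumes $a = b = R = 1$, values that are inferred from the numbers of the example rather than stated here. The function names are hypothetical.

```python
def unconstrained_piece(K_hat, H_hat, G_hat, a=1.0, b=1.0, R=1.0):
    """Minimize R*u^2 + K_hat*y^2 + H_hat*y + G_hat over u, with y = a*x + b*u.
    Returns the cost coefficients (K, H, G) of K*x^2 + H*x + G and the
    next-state map (slope, offset) of y = slope*x + offset."""
    d = R + b * b * K_hat
    cost = (a * a * R * K_hat / d,
            a * R * H_hat / d,
            G_hat - b * b * H_hat ** 2 / (4.0 * d))
    next_state = (a * R / d, -b * b * H_hat / (2.0 * d))
    return cost, next_state

def hedge_to_point(gamma, V_hat_at_gamma, a=1.0, b=1.0, R=1.0):
    """Cost (K, H, G) of driving the next state exactly to gamma,
    where the cost-to-go at gamma has the value V_hat_at_gamma."""
    c = R / (b * b)
    return (c * a * a, -2.0 * c * a * gamma, c * gamma * gamma + V_hat_at_gamma)

# Piece t=2 of the conditional expected cost-to-go (values from Step 1).
K_hat, H_hat, G_hat = 3.65, 0.5, 0.6875
cost_2U, map_2U = unconstrained_piece(K_hat, H_hat, G_hat)

# Hedging to the right boundary of piece 2, gamma = -2.75.
gamma = -2.75
V_hat_gamma = K_hat * gamma ** 2 + H_hat * gamma + G_hat
cost_2R = hedge_to_point(gamma, V_hat_gamma)

# x value at which the unconstrained next state reaches gamma.
x_upper = (gamma - map_2U[1]) / map_2U[0]

print(cost_2U, map_2U)
print(cost_2R, x_upper)
```

Under these assumptions the sketch returns coefficients that agree with $V^{2,U}_{N-2}$ and $V^{2,R}_{N-2}$ above to the printed precision, and the switching value `x_upper` agrees with the value -12.54 listed above.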


Step 4:

Now we are ready to compare the constrained problem costs,
so as to solve

$$V_{N-2}(x_{N-2}, r_{N-2}=1) = \min_{t=1,\ldots,\psi_{N-1}=7} V_{N-2}(x_{N-2},1|t) \qquad (5.39)$$

as in (5.37).

In figure 5.20 the values $\{\theta_{N-2}(t+1), \Theta_{N-2}(t): t=1,\ldots,6\}$ are
plotted (on the $x_{N-2}$ axis) and the regions of $x_{N-2}$ values where each
candidate cost applies are indicated. For example, when $x_{N-2}$ is
in the interval $(\Theta_{N-2}(3), \theta_{N-2}(4))$, the eligible candidates are
$\{V^{1,R}, V^{2,R}, V^{3,R}, V^{4,L}, V^{5,L}, V^{6,L}, V^{7,L}\}$; note that all of the eligible
costs over this interval correspond to active hedging to some point.

[Figure 5.20: Eligible regions of $x_{N-2}$ values for candidate costs-to-go in example 5.1]


A "brute force" approach to solving (5.39) would be to compute
all 19 functions of $x_{N-2}$ shown in figure 5.20, and then to compare
those that are eligible over each of the indicated $x_{N-2}$ intervals
so as to determine which is optimal.

Fortunately we can avoid many of these computations
from a consideration of the shape of the conditional expected cost-to-go
$\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$ in figure 5.19, and by using facts (5.31)-(5.36)
that were developed in the proof of Proposition 5.1.

Consider $V^{1,R}_{N-2}(x_{N-2},1)$ and $V^{2,L}_{N-2}(x_{N-2},1)$ as functions of $x_{N-2}$.
Each corresponds to driving $x_{N-1}$ to the value $\gamma_{N-1}(1) = -6.750$
(from the right or from the left). But the conditional expected
cost-to-go $\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$ is continuous at $\gamma_{N-1}(1)$ (and equals
164.5, as shown in figure 5.19). Thus $V^{1,R}_{N-2}(x_{N-2},1)$ equals
$V^{2,L}_{N-2}(x_{N-2},1)$ as a function of $x_{N-2}$. The same is true for each pair of
functions $V^{t,R}_{N-2}$, $V^{t+1,L}_{N-2}$ that correspond to driving $x_{N-1}$ to a point
$\gamma_{N-1}(t)$ where $\hat V_{N-1}(x_{N-1}|1)$ is continuous. That is,

$V^{1,R}_{N-2}(x_{N-2},1) \equiv V^{2,L}_{N-2}(x_{N-2},1)$ (driving $x_{N-1}$ to $\gamma_{N-1}(1)$)
$V^{2,R}_{N-2}(x_{N-2},1) \equiv V^{3,L}_{N-2}(x_{N-2},1)$ (driving $x_{N-1}$ to $\gamma_{N-1}(2)$)
$V^{5,R}_{N-2}(x_{N-2},1) \equiv V^{6,L}_{N-2}(x_{N-2},1)$ (driving $x_{N-1}$ to $\gamma_{N-1}(5)$)
$V^{6,R}_{N-2}(x_{N-2},1) \equiv V^{7,L}_{N-2}(x_{N-2},1)$ (driving $x_{N-1}$ to $\gamma_{N-1}(6)$).

At a point $x_{N-1}=\gamma$ where $\hat V_{N-1}(x_{N-1}|1)$ is discontinuous, the
cost-to-go that corresponds to driving to the side of $x_{N-1}=\gamma$ where
$\hat V_{N-1}(x_{N-1}|1)$ is less is obviously lower than the cost of driving to
the more expensive side of $x_{N-1}=\gamma$. Thus as functions of $x_{N-2}$,

$V^{4,L}_{N-2}(x_{N-2},1) < V^{3,R}_{N-2}(x_{N-2},1)$ (best side is to $\gamma_{N-1}(3)^+ = -1^+$)
$V^{4,R}_{N-2}(x_{N-2},1) < V^{5,L}_{N-2}(x_{N-2},1)$ (best side is to $\gamma_{N-1}(4)^- = 1^-$)


Using the above relationships and (5.31)-(5.36) we can eliminate


^  1 L  TT3,R  .^5 ;  L  ..5  /  R  ..6  f  L  __6/R  ^  L i  c  .  «  .  .  , 

tVN-2'  Vi'  V-2'  Vi’  VN-2'  V2'  VN-2-  VN-2}  fr°“  c““d“»t““  by 


the  following  steps  (each  of  which  is  indicated  on  figure  5.21): 


1.  As  functions  of  x^_  , 

'&X-2'1’  5  '&2lV2'1)>  VN-2(V2'1>  at 

Vl  -  V2lJ"-  ®1US 

1  R 

V  '  cannot  be  optimal  for  x  €(6  ^  • 


FIGURE 5.21: Eliminating candidate costs; (n) indicates that the specified cost is eliminated
over this interval of x values by step n in the text.


V„  -  cannot  be  optimal  for  x  _  >  0„  ^ (2)  because 
N-2  N— 2  N-2 

VN-2lXN-2'1>  3  v«lV2'11- 


i7N'2  cannot  be  optimal  for  x^^  <  ®n-2^2^  ^cause 


V2,L  =  v1,R  >  V1,U  . 
N-2  N-2  N-2 


V  '  cannot  be  optimal  for  x  6(8  , (3),0  (3)) 

N-2  N-2  N-2  N-2 

because  >  V^°  . 


V  '  cannot  be  optimal  for  x  >  0„  _(3)  because 

N— 2  N— 2  N— 2 

i-l  =  i'-l  >  Vl-l  £“  V2>3«-2(3)- 


V  ' _  cannot  be  optimal  over  x„  _  <  8„  _ (2)  because 
N—  2  N—  2  N-2 

^2  =  VN-2  >  V^2  V2<V2<2>- 


cannot  be  optimal  for  x  6(8  - (2) ,0  (2) ) 

N— 2  N— 2  N— 2  N-2 


V3'L  =  v2'R  >  v2'D  . 
N-2  N-2  N-2 


V  '  cannot  be  optimal  for  x  6(0  _ (3) ,6  (4)) 

N—  2  N—  2  N—  2  N—  2 


because  V2'-  >  . 


VN^2  cannot  be  optimal  for  *N_2  e(0„  ^ (4) ,0„  „(4) ) 


N-2  "N-2 


b8CaUSe  VN-2  >  VS-2  5  VH-2  ' 


3,S 


Vm_  -  cannot  be  optimal  for  x ,  _  >  .(4) 

N-  2  N—  2  N—  2 


because  >  V4,!i*  and  V4f^  >  v4f^  for  x  >  0 

1130  N-2  N-2  N-2  N-2  N-2 


VS,L 

N-2 

cannot 

be  optimal 

f°r  Vi  <  V2 

V5'o 

>  v4'* 

and  V4'^ 

>  V4'^  for  x 

N-2 

N-2 

N-2 

N-2  N- 

v5'L 

N-2 

cannot 

be  optimal 

f°r  XN-2  €(9N-2 

N-2 


because  >  V*'*  >  ”4'U 


N-2  N-2  N-2  . 


»5,L 


vr.'  cannot  be  optimal  for  xvt  €(0xt  _  (4)  ,8  (5)) 

N— 2  N-2  N-2  N-2 


because  >  v4'^  . 

N-2  N-2 


,5,R 


Vn^2  cannot  be  optimal  for  xN_2  ®(®N-2  ^ '®N-2*6^ 


,  „5,R  _  6,L  .  6,U 

because  VN_2  =  VN_2  >  VN_2  . 


5  R 

V  '  cannot  be  optimal  for  x  >  0  _(6)  because 


We see from figure 5.21 that

• $V^{4,U}_{N-2}$ is optimal for $\theta_{N-2}(4) \le x_{N-2} \le \Theta_{N-2}(4)$
• $V^{4,L}_{N-2}$ is optimal for $\Theta_{N-2}(3) \le x_{N-2} \le \theta_{N-2}(4)$
• $V^{4,R}_{N-2}$ is optimal for $\Theta_{N-2}(4) \le x_{N-2} \le \theta_{N-2}(5)$.

To complete the minimization of (5.39), we first solve for the
intersections of $V^{4,L}_{N-2}(x_{N-2},1)$ and $V^{3,U}_{N-2}(x_{N-2},1)$. We find that
they intersect at

$x_{N-2} = -6.977,\ -2.1414$

and that for $x_{N-2} < -6.977$, $V^{4,L}_{N-2} > V^{3,U}_{N-2}$. Thus

• $V^{3,U}_{N-2}$ is optimal for $\theta_{N-2}(3) \le x_{N-2} \le -6.977$
• $V^{4,L}_{N-2}$ is optimal for $-6.977 < x_{N-2} \le \theta_{N-2}(4)$.

In  addition, 

4  L  2  ^ 

•  V.  '  doesn't  interesect  V  ' _ 


V  -  doesn't  intersect  V  . 
N-2  N-2 


in  (-“,  0  (1) ) 

N-2 


Thus to complete the determination of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ for
$x_{N-2} \le 0$, we need only to find the intersections of $V^{2,U}_{N-2}$ and $V^{1,U}_{N-2}$.
These occur at

$x_{N-2} = -31.18,\ -7.846$

and for $x_{N-2} < -31.18$, $V^{1,U}_{N-2} < V^{2,U}_{N-2}$.

Thus from figure 5.21 we see that

• $V^{1,U}_{N-2}$ is optimal for $x_{N-2} \le -31.18$
• $V^{2,U}_{N-2}$ is optimal for $-31.18 < x_{N-2} \le -12.54 = \theta_{N-2}(3)$
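The two intersection computations above amount to finding the real roots of the difference of two quadratics. A minimal Python sketch follows; the helper name `intersections` is hypothetical, and the coefficient triples $(K, H, G)$ are the candidate costs quoted in the text for example 5.1.

```python
import math

def intersections(q1, q2):
    """Real x values at which two quadratics K*x^2 + H*x + G coincide.
    Each quadratic is given as a tuple (K, H, G)."""
    dK, dH, dG = q1[0] - q2[0], q1[1] - q2[1], q1[2] - q2[2]
    if abs(dK) < 1e-12:                      # difference is linear (or constant)
        return [] if abs(dH) < 1e-12 else [-dG / dH]
    disc = dH * dH - 4.0 * dK * dG
    if disc < 0.0:                           # the curves never cross
        return []
    root = math.sqrt(disc)
    return sorted([(-dH - root) / (2.0 * dK), (-dH + root) / (2.0 * dK)])

# Candidate costs at stage N-2 (coefficients as quoted in the text).
V4L = (1.0, 2.0, 3.2772727)
V3U = (0.780658, 0.0, 0.0)
V2U = (0.7849462, 0.1075268, 0.674059)
V1U = (0.7821909, 0.0, 0.0)

cross_43 = intersections(V4L, V3U)   # text reports -6.977 and -2.1414
cross_21 = intersections(V2U, V1U)   # text reports -31.18 and -7.846
print(cross_43)
print(cross_21)
```

Because $V^{1,U}$ and $V^{2,U}$ have nearly equal leading coefficients, the second pair of roots is sensitive to rounding of the coefficients, so agreement with the text should only be expected to the digits printed there.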


From the symmetry of this problem we need only consider $x_{N-2} \le 0$
or $x_{N-2} \ge 0$ (this is easily verified from the $V_{N-2}(x_{N-2},1|t)$ computed
above). Collecting all of the above information we thus have
the following:

$K_{N-2}(1:1) = .7821909 = K_{N-2}(9:1)$
$K_{N-2}(2:1) = .7849462 = K_{N-2}(8:1)$
$K_{N-2}(3:1) = .780658 = K_{N-2}(7:1)$
$K_{N-2}(4:1) = 1 = K_{N-2}(6:1)$
$K_{N-2}(5:1) = .6948682$

$H_{N-2}(2:1) = .1075268 = -H_{N-2}(8:1)$
$H_{N-2}(4:1) = 2 = -H_{N-2}(6:1)$

$G_{N-2}(2:1) = .674059 = G_{N-2}(8:1)$
$G_{N-2}(4:1) = 3.2772727 = G_{N-2}(6:1)$
$G_{N-2}(1:1) = G_{N-2}(3:1) = G_{N-2}(5:1) = G_{N-2}(7:1) = G_{N-2}(9:1) = 0$

$L_{N-2}(1:1) = .7821909 = L_{N-2}(9:1)$
$L_{N-2}(2:1) = .7849462 = L_{N-2}(8:1)$
$L_{N-2}(3:1) = .780658 = L_{N-2}(7:1)$
$L_{N-2}(4:1) = 1 = L_{N-2}(6:1)$
$L_{N-2}(5:1) = .6948682$

$F_{N-2}(2:1) = -.0537634 = -F_{N-2}(8:1)$
$F_{N-2}(4:1) = -1 = -F_{N-2}(6:1)$
$F_{N-2}(1:1) = F_{N-2}(3:1) = F_{N-2}(5:1) = F_{N-2}(7:1) = F_{N-2}(9:1) = 0$

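The value of $x_{N-1}$ obtained by application of the optimal
control laws is given by

$$x_{N-1}(x_{N-2}, r_{N-2}=1) = \begin{cases}
.2178091\,x_{N-2} & \text{if } x_{N-2} \le \delta_{N-2}(1) \\
.2150538\,x_{N-2} - .0537634 & \text{if } \delta_{N-2}(1) \le x_{N-2} \le \delta_{N-2}(2) \\
.219342\,x_{N-2} & \text{if } \delta_{N-2}(2) \le x_{N-2} \le \delta_{N-2}(3) \\
-1^+ & \text{if } \delta_{N-2}(3) \le x_{N-2} \le \delta_{N-2}(4) \\
.305138\,x_{N-2} & \text{if } \delta_{N-2}(4) \le x_{N-2} \le \delta_{N-2}(5) \\
1^- & \text{if } \delta_{N-2}(5) \le x_{N-2} \le \delta_{N-2}(6) \\
.219342\,x_{N-2} & \text{if } \delta_{N-2}(6) \le x_{N-2} \le \delta_{N-2}(7) \\
.2150538\,x_{N-2} + .0537634 & \text{if } \delta_{N-2}(7) \le x_{N-2} \le \delta_{N-2}(8) \\
.2178091\,x_{N-2} & \text{if } \delta_{N-2}(8) \le x_{N-2}
\end{cases} \qquad (5.43)$$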


FIGURE 5.22: Optimal expected cost-to-go from $(x_{N-2}, r_{N-2}=1)$
in example 5.1 (not drawn to scale).

FIGURE 5.23: Optimal control law from $(x_{N-2}, r_{N-2}=1)$ in example 5.1
(not drawn to scale).


4. There are four regions of $x_{N-1}$ avoidance:

$x_{N-1} \notin (-6.791, -6.759)$
$x_{N-1} \notin (1, 1.53)$
$x_{N-1} \notin (-1.53, -1)$
$x_{N-1} \notin (+6.759, 6.791)$

As in the previous stage, these regions of avoidance correspond
to $x_{N-2}$ values where $V_{N-2}(x_{N-2}, r_{N-2}=1)$ is not differentiable.
That is, to

$\{\delta_{N-2}(1), \delta_{N-2}(3), \delta_{N-2}(6), \delta_{N-2}(8)\}$

Note that there is no hedging-to-a-point (from time N-2)
associated with $\delta_{N-2}(1)$ and $\delta_{N-2}(8)$.

5. In the determination of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ above, we
did not have to compute and compare many of the quadratic
functions listed in (5.38).
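The regions of avoidance in item 4 can be checked directly from the piecewise-linear map (5.43): at a joining point where the map jumps, the skipped interval of $x_{N-1}$ values lies between the left and right limits. The sketch below is illustrative only; the helper name `gap_at` is hypothetical and the slopes, offsets and joining points are the rounded values quoted in the text.

```python
def gap_at(delta, left, right):
    """Interval of next-state values skipped when the map x_k -> x_{k+1}
    jumps at x_k = delta. `left` and `right` are (slope, offset) pairs of
    the linear pieces on either side of delta."""
    y_left = left[0] * delta + left[1]
    y_right = right[0] * delta + right[1]
    return (min(y_left, y_right), max(y_left, y_right))

# Jump at the first joining point (crossing of two unconstrained pieces).
gap_outer = gap_at(-31.18, (0.2178091, 0.0), (0.2150538, -0.0537634))

# Jump at the third joining point (unconstrained piece to hedging at -1).
gap_inner = gap_at(-6.977, (0.219342, 0.0), (0.0, -1.0))

print(gap_outer)   # compare with the interval near -6.79 and -6.76 in item 4
print(gap_inner)   # compare with the interval (-1.53, -1) in item 4
```

The two gaps on the positive side follow by the symmetry of the example.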


The  five  phenomena  listed  above  are  examined  for  the  general  problem 
in  the  next  section,  and  are  characterized  by  Propositions  5.2,  5.3 
and  Corollary  5.4.  From  consideration  of  both  stages  of  this  example, 
we  can  make  the  following  additional  observations  and  claims  that 
are  addressed  in  the  next  chapters: 


6. The boundaries of the left and right endpieces are
much further from zero at time N-2 than at time N-1:

$\delta_{N-1}(1) = -6.75 = -\delta_{N-1}(4)$
$\delta_{N-2}(1) = -31.2 = -\delta_{N-2}(8)$

This is an example of a general property: except for
pathological problems, the range of $x_k$ values for
which the optimal controller involves changing the probability
piece that the state will be in (at some time
{k+1, k+2,...,N-1,N}) monotonely increases as (N-k)
increases. That is, the endpieces move "further out"
from zero as (N-k) increases.

7. The size of the middle piece, where active hedging
serves no useful purpose, also grows between k=N-1
and k=N-2, but more slowly than the distance to the
end pieces:

. at time k=N-1, the middle piece is
$(\delta_{N-1}(2), \delta_{N-1}(3)) = (-2.75, 2.75)$

. at time k=N-2, the middle piece is
$(\delta_{N-2}(4), \delta_{N-2}(5)) = (-3.3, 3.3)$

This suggests a general property: as (N-k) increases,
the sizes of the middle pieces converge monotonely
(increasing or decreasing) to steady-state values.

8. Note that the curves

$V^{1,U}_{N-2} = V^{7,U}_{N-2}$ and $V^{3,U}_{N-2} = V^{5,U}_{N-2}$

are very close together (see (5.41)); in fact, they
are so close that figure 5.22 could not be drawn to
scale and still show the behavior of $V_{N-2}(x_{N-2}, r_{N-2}=1)$
at its joining points. This suggests that even
though the number of pieces $m_k(j)$ of $V_k(x_k, r_k=j)$
increases as (N-k) increases, many of them may be
"almost the same." This phenomenon is the basis of a
"finite look-ahead" approximation to the optimal steady-
state (infinite time horizon) solution of the general
problem, which is developed in chapter 7.

Setting aside for now the steady-state phenomena (6)-
(8) above, we proceed to clarify the combinatoric properties
(1)-(5) in the following section.

5.6  Some  Combinatoric  and  Qualitative  Issues 

In  this  section  we  examine  several  combinatoric  and  qualitative 
issues  related  to  the  (off-line)  determination  of  the  optimal  control 
laws  and  costs  of  Proposition  5.1.  Aspects  of  the  problem  that  are 
addressed here include:

. the nature of active hedging; examining what
values of x an optimal controller will hedge
to and why, and what values of x will be avoided
and why,

. determining how many of the candidate costs
(and control laws) in (5.38) must actually
be computed and compared,

. characterizing the number of pieces, $m_k(j)$,
of the optimal expected cost $V_k(x_k, r_k=j)$
and control law $u_k(x_k, r_k=j)$.

The  topics  studied  here  are  useful  in  the  specification  of  an 
efficient  way  to  carry  out  the  algorithm  steps  that  are  indicated 
in  the  proof  of  Proposition  5.1. 


A "brute force" way of determining $V_k(x_k, r_k=j)$ in (5.37) is
to compute and compare all of the

$$3\psi^j_{k+1} - 2 = 1 + 3\,\Big|\bigcup_{i\in C_j}\big[\{\nu_{ji}(t): t=1,\ldots,\bar\nu_{ji}-1\} \cup \{\delta^i_{k+1}(t): t=1,\ldots,m_{k+1}(i)-1\}\big]\Big| \qquad (5.44)$$

candidate quadratic cost functions listed in (5.38) (the right hand side


of  (5.44)  follows  from  (5.23)).  Thus 


3* 


j 

k+1 


-2 


<1*3 


(5.45) 


where equality in (5.45) corresponds to the "worst case"

$|C_j| = M$ (all forms are accessible to each other in one step)

and all the

$\nu_{ji}(\ell),\ \ell=1,\ldots,\bar\nu_{ji}-1$ and $\delta^i_{k+1}(t),\ t=1,\ldots,m_{k+1}(i)-1$ (for $i=1,\ldots,M$)

values are different.


This suggests that the number of pieces, $m_k(j)$, of each
$V_k[x_k, r_k=j]$ might be growing geometrically (with powers of 3) as
N-k increases. Fortunately, this is not the case, as suggested in
the previous section. The underlying reason that many of the
candidate costs in (5.38) can be discarded is the nonincreasing-slope
condition (3) of Proposition 5.1. In particular, the optimal controller
only actively hedges to x values that are discontinuous points of form
transition probabilities (i.e., to $\nu$'s). There is no active hedging
to joining points of the (next-stage forward) expected costs (i.e., to
$\delta$'s) precisely because the slope of these costs is nonincreasing at
such points.

These facts will be established as we pursue the following:

(1) First we show that many of the candidate costs
in (5.38) cannot be optimal (for any $x_k$ value)
and hence they need not be computed (Proposition 5.2).

(2) Next we show that each candidate cost in
(5.38) can be optimal over, at most, a
single interval of $x_k$ values. This bounds
the number of pieces $m_k(j)$ of $V_k(x_k, r_k=j)$
(Proposition 5.3 and Corollary 5.4).

The following proposition relates values of $x_{k+1}$ that are hedged
to with discontinuities of the expected cost-to-go $\hat V_{k+1}$,
and it eliminates many of the candidate costs in (5.38) from
consideration.


Proposition 5.2

The optimal control law $u_k(x_k, r_k=j)$ can only hedge to points
that are discontinuities of the conditional expected cost-to-go
$\hat V_{k+1}(x_{k+1}|r_k=j)$. That is,

(hedging to some point $\bar x$ from $(x_k, r_k=j)$) only if ($\bar x$ is a discontinuous point
of form transition probability $p(j,i;x)$ for some $i \in C_j$).

Only

$$C^j_{k+1} = \psi^j_{k+1} + \Big|\bigcup_{i\in C_j}\{\nu_{ji}(\ell): \ell=1,\ldots,\bar\nu_{ji}-1\}\Big| \qquad (5.46)$$

of the candidate costs listed in (5.38) must actually be computed
and compared in (5.37) so as to determine $V_k(x_k, r_k=j)$.

These costs are:

(i) for each $x_{k+1}$ region $\Delta_{k+1}(t)$, $t=1,\ldots,\psi^j_{k+1}$,
the "unconstrained" cost $V_k^{t,U}(x_k,j)$

(ii) for each form transition probability discontinuity
$\nu_{ji}(\ell)$ (for $i \in C_j$, $\ell=1,\ldots,\bar\nu_{ji}-1$), which is denoted
by $\gamma_{k+1}(t)$ for some $t \in \{1,\ldots,\psi^j_{k+1}-1\}$, we must
consider

• the "left constrained" cost $V_k^{t+1,L}(x_k,j)$
if $\hat V_{k+1}(\gamma_{k+1}(t)^+|r_k=j) < \hat V_{k+1}(\gamma_{k+1}(t)^-|r_k=j)$

• the "right constrained" cost $V_k^{t,R}(x_k,j)$
if $\hat V_{k+1}(\gamma_{k+1}(t)^-|r_k=j) < \hat V_{k+1}(\gamma_{k+1}(t)^+|r_k=j)$.

□


This proposition is proved in Appendix C.3. The proof follows from
certain relationships between the relative values of $\theta_k(t)$ and
$\Theta_k(t)$.

From Proposition 5.2 we know that the mapping

$x_k \mapsto x_{k+1}(x_k, r_k=j)$

need not be one-to-one, in that hedging to points may occur.

The following proposition lists a number of general qualitative
properties of the optimal controller that are suggested by example 5.1.
In particular, it characterizes the behavior of the $x_k \mapsto x_{k+1}$
mapping.


Proposition 5.3:

The optimal controller of Proposition 5.1 has the following
properties:

(1) At each time k and in each form $j \in M$, between joining
points $\{\delta_k(t): t=1,\ldots,m_k(j)-1\}$ of $V_k(x_k, r_k=j)$:

$$u_k(x_k, r_k=j) = -\frac{b(j)}{2a(j)R(j)}\,\frac{\partial V_k(x_k, r_k=j)}{\partial x_k} \qquad (5.47)$$

$$x_{k+1}(x_k, r_k=j) = a(j)\,x_k - \frac{b^2(j)}{2a(j)R(j)}\,\frac{\partial V_k(x_k, r_k=j)}{\partial x_k} \qquad (5.48)$$

(here $a(j)R(j) \ne 0$, $b(j) \ne 0$).

(2) At those joining points $\delta$ where the slope of $V_k(x_k, r_k=j)$
does not change (i.e., $\partial V_k(x_k, r_k=j)/\partial x_k$ exists),
$u_k(x_k, r_k=j)$ and $x_{k+1}(x_k, r_k=j)$ are continuous
functions of $x_k$.

(3) At those joining points $\delta_k(t)$ where the slope of $V_k$
decreases discontinuously:

(i) $u_k(x_k, r_k=j)$ increases discontinuously at $\delta$
when $b(j)/a(j) > 0$ (and decreases discontinuously
at $\delta$ when $b(j)/a(j) < 0$),

(ii) the mapping $x_k \mapsto x_{k+1}(x_k, r_k=j)$ increases
discontinuously at $\delta$ when $a(j) > 0$ (and decreases
discontinuously at $\delta$ when $a(j) < 0$).

(4) The mapping

$x_k \mapsto x_{k+1}(x_k, r_k=j)$

has the following properties:

(i) the mapping is monotonely nondecreasing if
$a(j) > 0$ (and monotonely nonincreasing if
$a(j) < 0$) for each $j \in M$

(ii) it consists of $m_k(j)$ line segments:

. one line segment with positive slope if
$a(j) > 0$ (negative slope if $a(j) < 0$) for
each $x_k$ region where an "unconstrained cost"
$V_k^{t,U}(x_k, r_k=j)$ is optimal:

$$V_k(x_k, r_k=j) = V_k^{t,U}(x_k, r_k=j), \quad t \in \{1,\ldots,\psi^j_{k+1}\}$$

$$x_{k+1} = \left[\frac{a(j)R(j)}{R(j)+b^2(j)\hat K_{k+1}(t)}\right] x_k - \frac{b^2(j)\hat H_{k+1}(t)}{2[R(j)+b^2(j)\hat K_{k+1}(t)]} \qquad (5.49)$$

. a constant line segment for each $x_k$ region
where a constrained cost is optimal

(iii) there are regions of $x_{k+1}$ avoidance associated
with (and only with) each $x_k = \delta$ value where
the slope of $V_k(x_k, r_k=j)$ decreases discontinuously.

(5) Each candidate linear control law (associated with the costs
listed in (5.38)) can be optimal over, at most, a single
interval of $x_k$ values.

□

The  proof  of  this  appears  in  Appendix  C.4,  and  it  will  be  verified 
at  the  end  of  this  section  for  example  5.1. 
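As a numerical illustration of the relations (5.47)-(5.48) between the slope of the cost and the control, the following Python sketch evaluates them on the second piece of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ from example 5.1 and compares the result with the tabulated gains. It assumes $a = b = R = 1$ (inferred from the numbers of the example, not stated in this section), and the function name is hypothetical.

```python
def control_from_cost_slope(K, H, x, a=1.0, b=1.0, R=1.0):
    """On a quadratic piece K*x^2 + H*x + G of the optimal cost,
    recover the control from the cost slope and propagate the state."""
    dV = 2.0 * K * x + H                 # slope of the cost piece at x
    u = -b / (2.0 * a * R) * dV          # control from the cost slope, as in (5.47)
    x_next = a * x + b * u               # next state under x_{k+1} = a*x_k + b*u_k
    return u, x_next

# Piece 2 of V_{N-2}: valid for x between about -31.18 and -12.54.
K2, H2 = 0.7849462, 0.1075268
L2, F2 = 0.7849462, -0.0537634           # tabulated control-law gains for piece 2

x = -20.0
u, x_next = control_from_cost_slope(K2, H2, x)

u_table = -L2 * x + F2                    # u = -L x + F
x_next_table = 0.2150538 * x - 0.0537634  # second row of (5.43)

print(u, u_table)
print(x_next, x_next_table)
```

Under the stated assumptions the two ways of computing the control, and the two ways of computing the next state, agree to the precision of the tabulated coefficients.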

Proposition 5.2 restricts the number of candidate costs that
must be considered in (5.37), and fact (5) of Proposition 5.3 says
that each candidate can be optimal over at most one interval.

Thus we immediately have:

Corollary 5.4:

The number of pieces of the optimal expected costs-to-go
$V_k(x_k, r_k=j)$ and their associated control laws are bounded above by

$$m_k(j) \le C^j_{k+1} = \psi^j_{k+1} + \Big|\bigcup_{i\in C_j}\{\nu_{ji}(\ell): \ell=1,\ldots,\bar\nu_{ji}-1\}\Big| \qquad (5.50)$$

A weaker bound which follows from (5.24) is

$$m_k(j) \le C^j_{k+1} \le 1 + \sum_{i\in C_j}(m_{k+1}(i)-1) + 2\,\Big|\bigcup_{i\in C_j}\{\nu_{ji}(\ell): \ell=1,\ldots,\bar\nu_{ji}-1\}\Big| \qquad (5.51)$$

Note that in (5.50) the factor of 3 in (5.45) is eliminated.
Corollary 5.4 says that the number of pieces in each optimal
expected cost $V_k(x_k, r_k=j)$ grows

. at most linearly with the number of transition
probability pieces

. at most geometrically with the number of elements
of $C_j$; that is, the number of forms accessible from
j in one time step.

Suppose that the piecewise-constant form transition probabilities
in (5.3) are approximations of the true probabilities. From (5.50)-
(5.51) we see that there is a tradeoff between

. the accuracy of the $p(j,i;x)$ approximations (in terms
of the number of pieces $\bar\nu_{ji}$ that are used)

versus

. the complexity of

. the algorithm computations
(in terms of $C^j_{k+1}$)

and

. the resulting controller
(in terms of the number of $V_k(x_k, r_k=j)$
and $u_k(x_k, r_k=j)$ pieces, $m_k(j)$).

We conclude this section by applying the Propositions and Corollaries
developed here to example 5.1.


Example 5.1, continued:

We have already seen that hedging-to-a-point from $(x_{N-1}, r_{N-1}=1)$
and from $(x_{N-2}, r_{N-2}=1)$ is only to the discontinuities of the form
transition probabilities $\nu_{12}(1) = -1$ and $\nu_{12}(2) = +1$ (see (5.19) and
(5.43)).

Since for this example

$\Big|\bigcup_{i\in C_1}\{\nu_{1i}(\ell): \ell=1,2\}\Big| = 2$, $\psi_N = 3$, $\psi_{N-1} = 7$,

the number of candidate costs that actually had to be
computed and compared (according to (5.46) of Proposition
5.2) was 5 for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ and 9 for $V_{N-2}(x_{N-2}, r_{N-2}=1)$.

From the shapes of $\hat V_N(x_N|r_{N-1}=1)$ (figure 5.12) and $\hat V_{N-1}(x_{N-1}|r_{N-2}=1)$
(figure 5.19), Proposition 5.2 specifies that these candidates are

for $V_{N-1}(x_{N-1}, r_{N-1}=1)$:

$$\{V^{1,U}_{N-1},\ V^{2,L}_{N-1},\ V^{2,U}_{N-1},\ V^{2,R}_{N-1},\ V^{3,U}_{N-1}\} \qquad (5.52)$$

for $V_{N-2}(x_{N-2}, r_{N-2}=1)$:

$$\{V^{1,U}_{N-2},\ V^{2,U}_{N-2},\ V^{3,U}_{N-2},\ V^{4,L}_{N-2},\ V^{4,U}_{N-2},\ V^{4,R}_{N-2},\ V^{5,U}_{N-2},\ V^{6,U}_{N-2},\ V^{7,U}_{N-2}\} \qquad (5.53)$$

FIGURE 5.25: Eligible costs for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ in example 5.1.
The x's in the figure indicate candidate costs that are
eliminated from consideration in (5.37) by Proposition 5.2.

Thus we see that we did not have to compute $V^{1,R}_{N-1}$ and $V^{3,L}_{N-1}$
in Section 5.3. The application of Proposition 5.2 for $V_{N-1}(x_{N-1}, r_{N-1}=1)$
is shown pictorially in figure 5.25. The candidate costs listed in
(5.53) are precisely those that we found we had to compute in section
5.5. We have already shown¹ that Propositions 5.2, 5.3 and Corollary 5.4
hold for this problem at k=N-1, k=N-2. Note that the bound (5.50)
in Corollary 5.4 holds with equality, and

$m_{N-k}(1) \le 1 + 4k$

follows directly.
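The counting argument for example 5.1 can be tabulated with a short Python sketch. It compares the "brute force" count $3\psi - 2$ of (5.44) with the reduced count $\psi + |\{\nu\}|$ of (5.46) stage by stage. The sketch assumes, consistently with $\psi_N = 3$ and $\psi_{N-1} = 7$ above, that only the joining points of the form-1 cost and the two discontinuities $\nu_{12}$ enter the composite partition, that the cost at the terminal time has a single piece, and that the bound (5.50) is met with equality at every stage, as the text reports for k=N-1 and k=N-2. All names are hypothetical.

```python
def brute_force_count(psi):
    """Number of candidate costs in (5.38): L, U, R for each piece,
    minus the two nonexistent outer constraints."""
    return 3 * psi - 2

def reduced_count(psi, n_nu):
    """Number of candidates that Proposition 5.2 requires, as in (5.46)."""
    return psi + n_nu

n_nu = 2          # two discontinuities of p(1,2;x), at -1 and +1
m = 1             # assumed: one piece at the terminal time N
rows = []
for stages_to_go in (1, 2):
    psi = 1 + (m - 1) + n_nu            # pieces of the composite partition
    brute = brute_force_count(psi)
    m = reduced_count(psi, n_nu)        # bound (5.50), assumed tight here
    rows.append((stages_to_go, psi, brute, m))
    assert m <= 1 + 4 * stages_to_go    # the bound m_{N-k}(1) <= 1 + 4k

for row in rows:
    print(row)
```

For the two stages of the example this reproduces the 19 brute-force candidates at stage N-2 and the 5 and 9 candidates that actually had to be compared.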


5.7  Summary 

In  this  chapter  we  have  considered  a  class  of  nonlinear  stochastic 
JLQ  control  problems  and  have  developed  a  procedure  for  their  solution. 
The  basic  idea  of  this  solution  procedure  is  simple  and  the  solution  form 
is  conceptually  straightforward  (although  the  notation  required  becomes 
quite  complex) . 

We  have  identified  some  basic  properties  of  the  problem  that 
reduce  the  combinatorics  involved  in  the  solution  procedure.  These 
facts  (and  others  to  be  developed)  will  be  exploited  in  the  construction 
of  an  efficient  solution  algorithm  in  chapter  7. 


¹ In the previous section.

We have also identified some basic qualitative properties of
the optimal controller. These include hedging-to-a-point, regions
of avoidance, and endpieces and middlepieces of the expected costs-
to-go and control laws.

From analysis of the optimal controllers developed here we
can gain insight into the structure and nature of controllers that
use active hedging. In chapters 6 and 7 we will continue our examination
of the qualitative properties of these controllers. In particular,
the steady-state behavior of the infinite time horizon problem
is examined.


QUALITATIVE  PROPERTIES  OF  THE  SCALAR  X-DEPENDENT  JLQ  CONTROLLER 


6.1  Introduction 

In  this  chapter  we  consider  certain  qualitative  properties  of  the 
optimal  JLQ  controller  of  chapter  5,  as  the  number  of  stages  (N-k) 
from  the  terminal  time  increases.  We  will  restrict  our  attention  to 
JLQ  problems  like  those  of  chapter  5,  but  for  simplicity  we  make  the 
additional  assumptions  that 


(1) $P(j) = G_T(j) = 0$ (6.1)

(2) $S(j) = H_T(j) = 0$ (6.2)

(3) $a(j) > 0$ (6.3)

(4) $Q(j) > 0$ (6.4)

for all $j \in M$.

We begin in sections 6.2 and 6.3 by examining the behavior of the
optimal control laws and expected costs-to-go when x is far from zero
("endpieces") and when x is near zero ("middlepieces"). Over these
regions of x values, $V_k(x_k, r_k=j)$ can be computed from sets of recursive
difference equations without carrying out all of the steps of section 5.4.
The equations specifying these endpieces and middlepieces of the optimal
controller are the same as those that solve certain corresponding
x-independent JLQ problems (as in chapter 3).

In section 6.4 we obtain upper and lower bounds on the costs
$V_k(x_k, r_k=j)$ when $x_k$ is between these endpiece and middlepiece regions.
When the system is stabilizable in each form $j \in M$, the difference equations
describing these bounds converge to steady-state values. These bounds
can themselves be bounded by certain x-independent JLQ problems (of
chapter 3). From this fact we obtain sufficient (but not necessary)
conditions for the upper and lower bounds on $V_k(x_k, r_k=j)$ to converge to
steady-state values when not all of the forms have stabilizable dynamics.

In sections 6.5 and 6.6 we illustrate certain fundamental qualitative
properties of the optimal JLQ controller. We do this by exploring
a particular class of problems in greater detail. Specifically we examine
the parametric dependence of

• hedging regions: these are intervals of x values from
which the optimal controller hedges to a point;
specifically, the best strategy from such an x is to
use the control to drive the system into a different
piece of the form transition probabilities.

• regions of avoidance: these are x values that the
optimal controller keeps the system away from.

• the stability properties of the closed loop optimally
controlled system over different pieces (of x values).

• the existence of local minima in the expected costs-to-go.

In chapter 7 we will present a solution algorithm that uses the results
of this chapter and chapter 5 to eliminate many of the computations
specified by the section 5.4 solution procedure. We will also use the
problem class discussed in sections 6.5-6.6 as a vehicle for exploring
additional qualitative properties of the controller.


6.2  Endpieces  of  the 


timal  Controller 


In  this  section  we  study  the  endpieces*  of  (x^r^i)  and 

vwj): 


V?xk'j)'  u?Vj)  for  \  -  5k(1) 


^Vj)'  \e(xk'j)  for  \  -  5k(mk(j>“1) 


(for  each  j€M) 


The  basic  results  of  this  section  are  as  follows: 

(1)  for  finite  time  horizon  problems,  if  x^  is  negative 
enough  or  positive  enough  the  optimal  strategy  is  to 
keep  x  in  the  same  extreme  x-pieces  of  the  form  transi¬ 
tion  probabilities  p(j,i:x)  for  all  ie  j  (from  each 
j  €  M)  for  all  future  times. 


That  is,  the  controls 


Le'  denotes  "left  endpiece"  and  'Re'  denotes  "right  endpiece 


keep  x]t+i/"--»xN  in  the  same  extreme  (i.e.,  far  from  zero)  piece  of  the 
form  transition  probability. 

For these extreme $x_k$ values the x-dependent JLQ control problem of
chapter 5 reduces to an x-independent one. The optimal expected costs-
to-go and control laws (in each $j \in M$) for these endpieces can be computed
off-line via a set of M coupled recursive difference equations (one set
for the left endpieces and one for the right endpieces). Thus the endpiece
functions can be computed without following all of the steps of
section 5.4.

(2)  For infinite time-horizon problems, as $(N-k) \to \infty$ these
endpieces of the costs-to-go and control laws converge
to steady-state (constant parameter) functions of x if
the dynamics in each form are stabilizable (i.e.,
$b(j) \ne 0$ or $|a(j)| < 1$).


(3)  In general the range of $x_k$ values between these endpieces
becomes infinite as $(N-k) \to \infty$. The width of $x_k$ between
the endpieces of $V_k(x_k, r_k = j)$ remains finite if, once
the system is in form j, it cannot be in any¹ form having
x-dependent form transition (exit) probabilities for
more than one time step.


Fact  (2)  is  well  known  from  the  LQ  case.  Facts  (1)  and  (3)  are  proved 
in  Proposition  6.1  and  Proposition  6.3  respectively. 

1  Including  (possibly)  j  itself. 




The following proposition lists the equations for the left and
right endpieces. It is stated for the general JLQ controller of
Proposition 5.1 (with $P(j)$, $G_T(j)$, $S(j)$, $H_T(j)$ not necessarily zero).
However, to simplify notation we assume that

$$a(j) > 0, \qquad j \in M \qquad (6.7)$$

and we will exclude problems where the system "just coasts" in some
form j with $u_k(x_k, r_k=j) \equiv 0$ by requiring


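$$\begin{pmatrix} Q(j) & S(j)/2 \\ S(j)/2 & P(j) \end{pmatrix}, \quad \begin{pmatrix} K_T(j) & H_T(j)/2 \\ H_T(j)/2 & G_T(j) \end{pmatrix} > 0 \qquad \forall\, j \in M . \qquad (6.8)$$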


Proposition 6.1 (Endpieces)

Consider the JLQ problem of Proposition 5.1, where (6.7)-(6.8)
hold.

(1)  For $x_k \le \delta_k^j(1)$, the optimal control laws and
expected costs-to-go are

$$V_k(x_k,j) = V_k^{Le}(x_k,j) \triangleq x_k^2 K_k^{Le}(j) + x_k H_k^{Le}(j) + G_k^{Le}(j) \qquad (6.9)$$

$$u_k(x_k,j) = u_k^{Le}(x_k,j) \triangleq -L_k^{Le}(j)\,x_k + F_k^{Le}(j) \qquad (6.10)$$

(2)  For $x_k \ge \delta_k^j(m_k(j)-1)$, the optimal expected costs-
to-go and control laws are

$$V_k(x_k,j) = V_k^{Re}(x_k,j) \triangleq x_k^2 K_k^{Re}(j) + x_k H_k^{Re}(j) + G_k^{Re}(j) \qquad (6.11)$$

$$u_k(x_k,j) = u_k^{Re}(x_k,j) \triangleq -L_k^{Re}(j)\,x_k + F_k^{Re}(j) \qquad (6.12)$$


(3)  The  parameters  in  (6. 9) -(6. 12)  are  computed  recursively 
backwards  in  time  from  N  by 

^  3  R(j)+bJ(j)^1(j) 


k  3  R(3)+b2(J>>^1<3} 

2  2 

Le...  =  eLe  ...  b 

^  ^k+1  ^  2  ^Le 

k  k+1  4[R(j)+b^(j)K^1(j)] 


where 


^i<3>  - vi)[^i(i,ta<il1 


with  terminal  conditions 


*S*U>  -  K^<j>  =  KT(j) 

H^<j>  -  C «>  -  Vj) 

<3^<j)  -  qt(3)  ■ 

The  control  law  gains  are 

a(j)b(j)K^1(j) 

R(j)+b2(j)^1(j) 

2[R(j)+b2(j)K^1(j)] 

a(j)b(j)i^®1(j) 
R(j)+b2(j)K^1(j) 


Fk6(j)  * 

T  Re  . . . 

\  l3)  = 


2[R(j)+b2(j)^J1(j)] 


(€ 

(€ 

(€ 


(€ 


(€ 


(€ 


□ 


(€ 


Proof (sketch): Recall that each form transition probability p(j,i)
is piecewise-constant in x with $\nu_{ji}$ pieces:

$$p(j,i:x) = \lambda_{ji}(t) \quad \text{for } x \in (\nu_{ji}(t-1),\ \nu_{ji}(t)), \qquad t = 1,\dots,\nu_{ji}$$

where $\nu_{ji}(0) = -\infty$, $\nu_{ji}(\nu_{ji}) = +\infty$.


It is clear that for $x_k$ negative enough we will have

$$x_{k+\ell} < \nu_{ji}(1) \qquad \forall\, i,j \in M \qquad (6.32)$$

and for $x_k$ positive enough we will have

$$x_{k+\ell} > \nu_{ji}(\nu_{ji}-1) \qquad \forall\, i,j \in M \qquad (6.33)$$

for $\ell = 1,\dots,N-k$ (here $N < \infty$). This is verified in Appendix C.5. Thus
Proposition 6.1 is just a restatement of the x-independent JLQ solution
(Prop. 3.1), where we make the identifications

for left endpieces:  $p_{ji} = \lambda_{ji}(1)$

for right endpieces:  $p_{ji} = \lambda_{ji}(\nu_{ji})$

$\forall\, i,j \in M$.
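The identification above can be illustrated with a short sketch. It assumes the pure quadratic case (the linear and constant terms H, G, F are zero) and assumes the backward recursion has the form whose fixed point is the steady-state equation (6.37); the function names and the data layout (`lam[j][i]` as a left-to-right list of piece values of p(j,i:x)) are hypothetical, chosen only for illustration.

```python
def extreme_piece_probs(lam, side):
    """Pick the x-independent probabilities used for the endpieces.

    lam[j][i] is the list of piece values of p(j,i:x), ordered from the
    leftmost piece to the rightmost one (assumed layout).
    side == "left"  -> lambda_ji(1), the leftmost piece
    side == "right" -> the rightmost piece
    """
    idx = 0 if side == "left" else -1
    return [[pieces[idx] for pieces in row] for row in lam]


def backward_step(K_next, P, a, b, Q, R):
    """One backward step of the scalar x-independent recursion (assumed form):

        K_hat(j) = sum_i P[j][i] * (K_next[i] + Q[i])
        K(j)     = a(j)^2 R(j) K_hat(j) / (R(j) + b(j)^2 K_hat(j))
    """
    M = len(a)
    K = []
    for j in range(M):
        K_hat = sum(P[j][i] * (K_next[i] + Q[i]) for i in range(M))
        K.append(a[j] ** 2 * R[j] * K_hat / (R[j] + b[j] ** 2 * K_hat))
    return K


# Illustrative two-form data (hypothetical numbers): three pieces per probability.
lam = [
    [[0.5, 1.0, 0.5], [0.5, 0.0, 0.5]],   # p(1,1:x), p(1,2:x)
    [[0.5, 1.0, 0.5], [0.5, 0.0, 0.5]],   # p(2,1:x), p(2,2:x)
]
P_left = extreme_piece_probs(lam, "left")
K_Nm1 = backward_step([1.0, 1.0], P_left, a=[2.0, 0.5], b=[0.0, 1.0],
                      Q=[1.0, 1.0], R=[1.0, 1.0])
```

Repeating `backward_step` from the terminal values gives the left endpiece parameters at earlier times; the right endpieces follow by passing `"right"` instead.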


Recall that the optimal expected cost-to-go $V_k(x_k, r_k=j)$ has $m_k(j)$
pieces, with joining points $\delta_k^j(1) < \delta_k^j(2) < \dots < \delta_k^j(m_k(j)-1)$.

For $x_k < \delta_k^j(1)$ and $x_k > \delta_k^j(m_k(j)-1)$, the form transition
probabilities will not change from time k+1 until time N (i.e., we will stay
in an extreme piece of each p(j,i:x)) because the optimal
controller will not drive x past $\nu_{ji}(1)$ or $\nu_{ji}(\nu_{ji}-1)$, respectively
(for any i accessible from j) at any future time.

Between $\delta_k^j(1)$ and $\delta_k^j(m_k(j)-1)$, the optimal controller will drive
x into a different probability piece¹ at some time k+2, ..., N. We define
the switching region of the controller from $r_k = j$ to be these $x_k$
values

$$S_k^j \triangleq \{x_k : \delta_k^j(1) < x_k < \delta_k^j(m_k(j)-1)\} \qquad (6.34)$$

as shown in figure 6.1.

As we will see, the behavior of the optimal controller and corresponding
state trajectories starting from $x_k \in S_k^j$ can involve one of
several phenomena. Specifically, for $x_k$ values close to zero the optimal
controller will keep future x's in the same probability piece as it
drives to zero. No active hedging is involved in these "middle pieces"
on either side of zero, as we will see in the next section. Outside of
this region (but in $S_k^j$) the state will switch probability regions. However
this can occur in distinctly different ways (involving hedging to points,
regions of avoidance and other types of behavior). We will characterize
these types of controller behaviors later in this chapter.

Clearly for finite times $(N-k) < \infty$, the switching region has
finite width, for each $j \in M$:

$$|S_k^j| = \delta_k^j(m_k(j)-1) - \delta_k^j(1) < \infty \qquad (6.35)$$


¹ From the piece that $x_{k+1}$ is in.

FIGURE 6.1:  Endpieces and Switching Region for $V_k(x_k, r_k=j)$.

Note that we have not yet characterized the values of $\delta_k^j(1)$ and
$\delta_k^j(m_k(j)-1)$.

Now consider the infinite time version of the problem, where we
wish to minimize

$$\lim_{(N-k_0)\to\infty} E\Big\{ \sum_{k=k_0}^{N-1} \big[ u_k^2 R(r_k) + x_{k+1}^2 Q(r_{k+1}) \big] \Big\}$$

subject to (5.1)-(5.4) and (5.6).


We consider the existence of the limiting functions

$$V_\infty^{Le}(x,j) = \lim_{(N-k)\to\infty} V_k^{Le}(x_k = x, r_k = j)$$

$$V_\infty^{Re}(x,j) = \lim_{(N-k)\to\infty} V_k^{Re}(x_k = x, r_k = j) .$$

Since the endpiece costs-to-go are obtained in Proposition 6.1 by
equations which correspond to x-independent JLQ problems, Proposition
3.2 gives necessary and sufficient conditions for there to exist
steady-state endpieces to the expected costs-to-go and control laws
in Proposition 6.1, as (N-k) grows large. We have directly the
following:

Proposition 6.2:  Consider the JLQ problem of Proposition 5.1 where
(6.1)-(6.4) hold. Then if we take

$$p_{ji} = \lambda_{ji}(1) \qquad \text{for all } i,j \in M,$$

then conditions (1)-(3) of Proposition 3.2 are necessary and sufficient
for the solution of the coupled difference equations (6.13)-(6.18),
(6.25)-(6.27) to converge to a unique constant set of nonnegative steady-
state values $\{K_\infty^{Le}(j) \ge 0,\ j \in M\}$ as $(N-k) \to \infty$, given by the M coupled
algebraic equations

$$K_\infty^{Le}(j) = \frac{a^2(j)R(j)\Big[\sum_{i \in C_j} \lambda_{ji}(1)\big[K_\infty^{Le}(i) + Q(i)\big]\Big]}{R(j) + b^2(j)\Big[\sum_{i \in C_j} \lambda_{ji}(1)\big[K_\infty^{Le}(i) + Q(i)\big]\Big]} \qquad (6.37)$$

for $j \in M$, with the optimal steady-state left endpieces

$$V_\infty^{Le}(x,j) = x^2 K_\infty^{Le}(j) . \qquad (6.38)$$

The steady-state left endpiece control laws are

$$u_\infty^{Le}(x,j) = -L_\infty^{Le}(j)\,x \qquad j \in M \qquad (6.39)$$

where the time invariant gains are given by

$$L_\infty^{Le}(j) = \frac{b(j)}{a(j)R(j)}\,K_\infty^{Le}(j) .$$

Similarly, if we take

$$p_{ji} = \lambda_{ji}(\nu_{ji}) \qquad \text{for all } i,j \in M, \qquad (6.40)$$

then conditions (1)-(3) of Proposition 3.2 are necessary and sufficient
for the solution of (6.19)-(6.27) to converge to a set of unique finite
constant nonnegative steady-state values $\{K_\infty^{Re}(j) \ge 0,\ j \in M\}$ as $(N-k) \to \infty$,
given by the M coupled algebraic equations

$$K_\infty^{Re}(j) = \frac{a^2(j)R(j)\Big[\sum_{i \in C_j} \lambda_{ji}(\nu_{ji})\big[K_\infty^{Re}(i) + Q(i)\big]\Big]}{R(j) + b^2(j)\Big[\sum_{i \in C_j} \lambda_{ji}(\nu_{ji})\big[K_\infty^{Re}(i) + Q(i)\big]\Big]} \qquad (6.41)$$

for $j \in M$, with the optimal steady-state right endpieces

$$V_\infty^{Re}(x,j) = x^2 K_\infty^{Re}(j) \qquad (6.42)$$

$$u_\infty^{Re}(x,j) = -L_\infty^{Re}(j)\,x \qquad (6.43)$$

where

$$L_\infty^{Re}(j) = \frac{b(j)}{a(j)R(j)}\,K_\infty^{Re}(j) . \qquad (6.44)$$

□
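When the conditions of the proposition hold, the coupled algebraic equations can be solved numerically by simply iterating the corresponding difference equation until it stops changing. The following is a minimal sketch under that assumption; the function names, the zero starting value and the tolerance are illustrative choices, not part of the original development.

```python
def steady_state_K(P, a, b, Q, R, tol=1e-12, max_iter=10_000):
    """Iterate K(j) <- a^2 R K_hat / (R + b^2 K_hat), with
    K_hat(j) = sum_i P[j][i] * (K[i] + Q[i]), until successive iterates
    differ by less than tol. P holds the x-independent probabilities
    p_ji of the chosen endpiece. Raises if no convergence is observed."""
    M = len(a)
    K = [0.0] * M
    for _ in range(max_iter):
        K_new = []
        for j in range(M):
            K_hat = sum(P[j][i] * (K[i] + Q[i]) for i in range(M))
            K_new.append(a[j] ** 2 * R[j] * K_hat / (R[j] + b[j] ** 2 * K_hat))
        if max(abs(x - y) for x, y in zip(K_new, K)) < tol:
            return K_new
        K = K_new
    raise RuntimeError("iteration did not converge; steady state may not exist")


def steady_state_gains(K, a, b, R):
    """Time-invariant gains L(j) = b(j) K(j) / (a(j) R(j))."""
    return [b[j] * K[j] / (a[j] * R[j]) for j in range(len(a))]


# Single-form illustration (hypothetical numbers): a = b = Q = R = 1.
K_inf = steady_state_K([[1.0]], a=[1.0], b=[1.0], Q=[1.0], R=[1.0])
L_inf = steady_state_gains(K_inf, a=[1.0], b=[1.0], R=[1.0])
```

In the single-form illustration the fixed point satisfies K(K+2) = K+1. A failure to converge within `max_iter` is only a numerical symptom; the actual existence test is the set of conditions in Proposition 3.2.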


Since we are considering a scalar x problem, if the dynamics in
each form are stabilizable then the expected costs-to-go (from each
form) will remain finite as $(N-k) \to \infty$. Stabilizability is trivial to
check for scalar systems: $b(j) \ne 0$ or $|a(j)| < 1$ is required, for each
$j \in M$. If any absorbing form j (i.e., $p_{jj} = 1$) is not stabilizable then
the expected costs-to-go become infinite for all forms from which
j is accessible.

If any nonabsorbing form j (i.e., $p_{jj} < 1$) is not stabilizable
then the existence of steady-state endpieces (and costs-to-go) depends
upon the dynamics of, and transition probabilities to, all forms
accessible from j. The existence of unstabilizable nonabsorbing forms is
not out of the realm of possibility in failure prone systems. For
example such a form might represent the temporary loss of an actuator
until it is repaired. The existence of finite steady-state endpieces for
systems having these forms is characterized by the necessary and sufficient
conditions of Proposition 3.2, which reduce to the following:


There exist constants $F_i$ for all $i \in M$ such that:

(1)  For each closed communicating class $C_j$ (having two or more
members), there exists a set of finite, positive scalars
$\{Z_1, \dots, Z_{|C_j|}\}$ satisfying the coupled equations


Z.  =  (1-p..)  1  pfc. 1(a.-b.F. ) 

1  11  11  1  11 


2t 


Q.  +  F.R. 
1  11 


y  Pu  z 

vk.  ^ii  * 


for all $i \in C_j$.

(2)  There exists a set of finite positive scalars $\{G_1, \dots, G_{|T|}\}$
satisfying the coupled equations


,  2fe  /  Q-+F.R- 

G,  =  (1-p..)  I  pt:1(a.-b.F.  )2t/  1  1  1 

i  11  t=l  11  1  1  1  /  + 


7  lii  G 

Vi 


(for all $i \in T \subset M$; T is the subset of transient forms in M).

(3)  All absorbing forms are stabilizable.


The reason that these conditions are so complex is that the controller
must account for an extremely wide range of possible behaviors. For
example, it is not enough that the system will eventually enter a
stabilizable state with probability one, as we will see in example 6.2.

When the only unstabilizable forms are transient forms $i \in T$ that are
not accessible from any form in their covers $C_i$, except themselves
(that is, once we leave i we can't return), then corollary 3.4 yields
a sufficient condition for the existence of steady-state endpiece cost
functions that is easier to test than (1)-(2) above:


Let us now consider the growth of the switching regions

$$|S_k^j| = \delta_k^j(m_k(j)-1) - \delta_k^j(1)$$

as (N-k) grows large. If this quantity were to converge to a finite
value as $(N-k) \to \infty$ it would mean that for $x_k$ negative enough (or
positive enough), the optimal controller does not make use of the
knowledge that the p(j,i) can be changed by active hedging. This
situation will obviously arise if none of the form transitions that
the system can make once it is in j are x-dependent (since active
hedging will be of no use). Finite switching regions also arise when
the system cannot be susceptible to any x-dependent $p(i,\ell)$ more
than once, after it has entered j. In general however, the switching
regions grow in width without bound as $(N-k) \to \infty$.


Proposition 6.3 (Growth of Switching Regions)

Consider the JLQ problem of Proposition 5.1. For each form j:

(i)  If, once the system is in form $r_k = j$, all of the
form transitions that the system can make are
x-independent then $V_k(x_k, r_k = j)$ has one piece:

$$m_k(j) = 1 \qquad \text{for all } k .$$

(ii)  If, once the system is in form $r_k = j$, all
of the form transitions that the system can
make are x-independent except for (at least) one
that has a single transition probability
discontinuity at x = 0, then $V_k(x_k, r_k = j)$
has two pieces

$$m_k(j) = 2$$

with

$$\delta_k^j(1) = 0 \quad \text{(joined at zero)}$$

and

$$|S_k^j| = 0$$

as $(N-k) \to \infty$.

(iii)  Assume that each form has stabilizable dynamics.
If, for $r_k = j$, the system cannot (from time k+1
to N) be in any form having x-dependent exit
probabilities for more than one time step then
$|S_k^j|$ will remain finite as $(N-k) \to \infty$.

Note that the system must have p(j,j) = 0 if p(j,i) is
x-dependent for any $i \in C_j$.

If one or more of the forms accessible from j is not
stabilizable then $|S_k^j| \to \infty$ as $(N-k) \to \infty$.

(iv)  If for $r_k = j$ it is possible to repeat an
x-dependent form transition (from j or from any i
accessible from j, including possibly p(j,j))
with transition probability discontinuities
not all at zero, then $|S_k^j| \to \infty$ as $(N-k) \to \infty$.


Proof (Sketch):

Parts (i) and (ii) are obvious. For part (iii), since there
are only finitely many forms, after a finite number of time steps
(say m) the system will have entered a stabilizable form i that
satisfies part (i) or part (ii). Thus as $(N-k) \to \infty$, since
$|S_k^i| = 0$ we have $|S_{k-m}^j|$ finite. For parts (iii) and (iv), if
one or more of the forms accessible from j is not stabilizable
then $|S_k^j| \to \infty$ since the expected cost-to-go in this form becomes
infinite as $(N-k) \to \infty$.

In (iv) if all of the forms are stabilizable then the ability
to repeat an x-dependent transition makes $|S_k^j|$ grow without bound.
The basic idea is as follows: Since each form j is stabilizable
we have by Proposition 6.2 that the steady-state endpieces
exist. The closed-loop optimal gain in the left endpiece becomes
arbitrarily close to:


V 


a(  j)R(D) 


(6.46) 


(6.47) 


as $(N-k) \to \infty$. This limiting value of the closed loop optimal gain must
be stable if the steady-state endpiece cost functions of Proposition
6.2 are to be finite. That is, we must have


a(j,  fr,,, 


(6.48) 


In appendix C.6 we show that the condition in (iv) and (6.48) make
$|S_k^j| \to \infty$ as $(N-k) \to \infty$.


The steady-state endpiece functions $V_\infty^{Le}(x,j)$ and $V_\infty^{Re}(x,j)$ are
useful in describing the asymptotic behavior of the optimal JLQ controller
even though the switching region between the endpieces becomes
arbitrarily large (in general) as $(N-k) \to \infty$. In particular, they are useful
in "finite-look ahead" approximations of the steady-state controller
which will be discussed in Chapter 7.

This completes our discussion of the endpieces of $V_k(x_k, r_k=j)$ and
$u_k(x_k, r_k=j)$. Several examples will be presented at the end of the
next section of this chapter.


6.3  Middlepieces of the JLQ Optimal Controller


In this section we consider the behavior of the optimal JLQ
controller of Proposition 5.1 near the origin, when the x-costs are
simple quadratics (i.e., when (6.1)-(6.4) hold). That is, we examine
here the middle pieces¹ of $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ for each $j \in M$:

$$V_k^{LM}(x_k,j),\ u_k^{LM}(x_k,j) \quad \text{for } \underline{\delta}_k^j \le x_k \le 0 \qquad (6.50)$$

$$V_k^{RM}(x_k,j),\ u_k^{RM}(x_k,j) \quad \text{for } 0 \le x_k \le \bar{\delta}_k^j \qquad (6.51)$$

where

$$\underline{\delta}_k^j = \max_\ell \{\delta_k^j(\ell) < 0\} \qquad (6.52)$$

$$\bar{\delta}_k^j = \min_\ell \{\delta_k^j(\ell) > 0\} \qquad (6.53)$$

where $\ell = 0, 1, \dots, m_k(j)$.

As we will see, if there are no form transition probability
discontinuities at zero,

$$\nu_{ji}(t) \ne 0, \qquad t = 1, \dots, \nu_{ji}-1 \quad \text{for } i,j \in M,$$

then the left and right middle pieces in Proposition 6.6 are given by
the same equations. That is, there is a single middlepiece valid in
$\underline{\delta}_k^j \le x_k \le \bar{\delta}_k^j$, given by

$$V_k^{LM}(x_k,j) = V_k^{RM}(x_k,j)$$

at each time k and in each form j.

¹ The superscripts "LM" and "RM" denote "left middlepiece" and "right
middlepiece" respectively.

The basic results of this section are as follows:
are  as  follows: 


(1)  For finite time horizon problems, if $x_k$ is close
enough to zero the optimal controller keeps $x_{k+1}, \dots, x_N$
in the same close-to-zero pieces of the transition
probabilities p(j,i:x) (and x is driven to zero).
The controls $u_{k+1}, \dots, u_{N-1}$ do not actively hedge (i.e.,
don't change form probability pieces from the close-to-zero
piece that $x_{k+1}$ is in) because there is no advantage in doing
so. The best strategy for these x close to zero is just to go
to zero.

As with the endpieces, the middlepieces correspond
to x-independent JLQ control problems. The middle
pieces of $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ can be computed
via sets of M coupled recursive difference equations.

(2)  For infinite time-horizon problems, as $(N-k) \to \infty$
these middlepieces converge to steady-state
(constant parameter) functions of x if the
dynamics are stabilizable.

(3)  At all times, the widths of the middlepieces are
finite (except when a middlepiece and endpiece
are the same at all times for some form j, because
there are no form transition probability
discontinuities on one side of zero for any form
accessible from j).

The above results are obtained in Propositions 6.4, 6.6 and 6.5
respectively. We first have the following:


Proposition 6.4: (Middlepieces)

Consider the JLQ problem of Proposition 5.1, where (6.1)-(6.4)
hold.

(1)  For $\underline{\delta}_k^j \le x_k \le 0$ the optimal expected costs-to-go
and control laws are

$$V_k(x_k,j) = V_k^{LM}(x_k,j) \triangleq x_k^2 K_k^{LM}(j) \qquad (6.54)$$

$$u_k(x_k,j) = u_k^{LM}(x_k,j) \triangleq -L_k^{LM}(j)\,x_k \qquad (6.55)$$

(2)  For $0 \le x_k \le \bar{\delta}_k^j$ the optimal expected costs-to-go
and control laws are

$$V_k(x_k,j) = V_k^{RM}(x_k,j) \triangleq x_k^2 K_k^{RM}(j) \qquad (6.56)$$

$$u_k(x_k,j) = u_k^{RM}(x_k,j) \triangleq -L_k^{RM}(j)\,x_k \qquad (6.57)$$

(3)  The parameters in (6.54)-(6.57) are computed recursively,
backwards in time from N by (6.13)-(6.31) where, for each
$i,j \in M$ we make the substitutions

LM replaces Le

$\lambda_{ji}(1)$ is replaced by $\lambda_{ji}(LM)$, the value of
p(j,i:x) valid for $x \in (\max\{\nu_{ji} < 0\},\ 0)$

RM replaces Re

$\lambda_{ji}(\nu_{ji})$ is replaced by $\lambda_{ji}(RM)$, the value of
p(j,i:x) valid for $x \in (0,\ \min\{\nu_{ji} > 0\})$.

Proof (sketch):

This proposition is a restatement of Proposition 3.1, where

for left middlepieces:  $p_{ji} = \lambda_{ji}$ valid in $(\underline{\nu}_{ji},\ 0]$

for right middlepieces:  $p_{ji} = \lambda_{ji}$ valid in $[0,\ \bar{\nu}_{ji})$

where

$$\underline{\nu}_{ji} \triangleq \max_\ell \{\nu_{ji}(\ell) < 0\} \qquad (6.58)$$

$$\bar{\nu}_{ji} \triangleq \min_\ell \{\nu_{ji}(\ell) > 0\} \qquad (6.59)$$

for all $i,j \in M$.

We have only the $x_k^2$ term in (6.54), (6.56) because of (6.1)-(6.2).
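The substitution of the middle-piece probabilities can be sketched in a few lines. This is only an illustration: the representation of each p(j,i:x) as a sorted list of finite joining points plus a list of piece values is an assumed data layout, and `middle_piece_values` is a hypothetical helper name.

```python
from bisect import bisect_left, bisect_right


def middle_piece_values(breakpoints, values):
    """Return (lambda_LM, lambda_RM) for one transition probability.

    breakpoints: sorted finite joining points of p(j,i:x)
    values:      piece values, left to right; len(values) == len(breakpoints) + 1

    The piece just to the left of zero is indexed by the number of joining
    points strictly below zero; the piece just to the right of zero by the
    number of joining points at or below zero.
    """
    if len(values) != len(breakpoints) + 1:
        raise ValueError("need exactly one more piece value than joining points")
    left_idx = bisect_left(breakpoints, 0.0)
    right_idx = bisect_right(breakpoints, 0.0)
    return values[left_idx], values[right_idx]


# No discontinuity at zero: both middle pieces use the same value.
same = middle_piece_values([-1.0, 1.0], [0.5, 0.0, 0.5])
# A discontinuity exactly at zero: left and right middle pieces differ.
split = middle_piece_values([0.0], [0.2, 0.8])
```

When no joining point lies at zero the two returned values coincide, which corresponds to the single-middlepiece case described at the start of this section.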
Consider figure 6.2. We see that there are two switching regions

•  left switching region $S_k^{jL}$

•  right switching region $S_k^{jR}$

which, together with the middle pieces, constitute the switching
region of figure 6.1.

For $x_k \in (\underline{\delta}_k^j, 0)$ and $x_k \in (0, \bar{\delta}_k^j)$ the form transition
probabilities will not change from time k+1 until N because the optimally
controlled state is (at $x_{k+1}$) in the probability piece that contains (or is
bounded by) zero. That is, for these $x_k$ values the controller does
not actively hedge with $u_{k+1}, \dots, u_{N-1}$. The following proposition
characterizes the values of $\underline{\delta}_k^j$ and $\bar{\delta}_k^j$.

Proposition 6.5:  Consider the middlepieces of Proposition 6.4.

(1)  If there is no form transition probability discontinuity
to the right of zero for any p(j,i) ($\forall\, i \in C_j$) and for any
$p(\ell,t)$ ($\forall\, \ell$ accessible from j) then
$V_k(x_k, r_k = j)$ has only one piece for $x_k \ge 0$.
That is, the right middlepiece $V_k^{RM}(x_k,j)$ extends to $+\infty$;
$\bar{\delta}_k^j = \infty$ by (6.53).

(2)  Similarly if there is no form transition probability
discontinuity to the left of zero for any p(j,i)
($\forall\, i \in C_j$) and for any $p(\ell,t)$ ($\forall\, \ell$ accessible from j)
then $V_k(x_k, r_k = j)$ has only one piece for $x_k \le 0$.
That is, the left middlepiece $V_k^{LM}(x_k,j)$ extends to $-\infty$;
$\underline{\delta}_k^j = -\infty$ by (6.52).


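(3)  Now suppose (1) does not hold. Let

$$\beta_j \triangleq \min\{\nu_{ji}(t) > 0 \ :\ i \in C_j,\ t = 1, \dots, \nu_{ji}-1\} .$$

Then at each time $k = N-1, N-2, \dots, k_0$:

$$0 \le \bar{\delta}_k^j \le \frac{\beta_j}{a(j)}\Big[1 + \frac{b^2(j)}{R(j)}\,\hat{K}_{k+1}^{RM}(j)\Big] \qquad (6.60)$$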


In  addition  the  {<5^  |k  =  N-l,N-2, . . . } 


are  related  as  follows 


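(4)  Now suppose (2) doesn't hold. Let

$$\alpha_j \triangleq \max\{\nu_{ji}(t) < 0 \ :\ i \in C_j,\ t = 1, \dots, \nu_{ji}-1\} .$$

Then at each time $k = N-1, N-2, \dots, k_0$:

$$\frac{\alpha_j}{a(j)}\Big[1 + \frac{b^2(j)}{R(j)}\,\hat{K}_{k+1}^{LM}(j)\Big] \le \underline{\delta}_k^j \le 0 \qquad (6.62)$$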


In  addition  the  [_5?  |k  =  N-l, N-2,.-]  are  related 

as  follows,  if  j  e  C.i 
J  3 


(6 


-k+l 


a(  j) 


b^(i)  ~LM 

1+  ^^i<3> 


R(j) 


<  6J  < 

-  k  - 


(6 


The proof of this proposition appears in appendix C.7.
It is obtained by direct calculation from the optimal closed
loop dynamics in the middlepieces, as specified by (6.55), (6.57).


We now consider the existence of steady-state middle pieces

$$V_\infty^{LM}(x,j) = \lim_{(N-k)\to\infty} V_k^{LM}(x_k = x, r_k = j)$$

$$V_\infty^{RM}(x,j) = \lim_{(N-k)\to\infty} V_k^{RM}(x_k = x, r_k = j)$$

for the infinite time horizon problem.

FIGURE 6.2:  Switching regions, endpieces and middlepieces
of $V_k(x_k, r_k = j)$.

As in Proposition 6.2, we have directly the following:

Proposition 6.6:  For the problem of Proposition 5.1 where (6.1)-(6.4)
hold, if we take

$$p_{ji} = \lambda_{ji}(LM) \quad \text{or} \quad p_{ji} = \lambda_{ji}(RM) \qquad \text{for all } i,j \in M$$

then conditions (1)-(3) of Proposition 3.2 are necessary and sufficient
for the solution of the coupled difference equations of Proposition 6.4
to converge to the unique constant sets of steady-state values:

$\{K_\infty^{LM}(j) \ge 0,\ j \in M\}$ for left middlepieces

$\{K_\infty^{RM}(j) \ge 0,\ j \in M\}$ for right middlepieces

as $(N-k) \to \infty$, given by the solutions of the sets of M coupled algebraic
equations

$$K_\infty^{LM}(j) = \frac{a^2(j)R(j)\Big[\sum_{i \in C_j} \lambda_{ji}(LM)\big[K_\infty^{LM}(i) + Q(i)\big]\Big]}{R(j) + b^2(j)\Big[\sum_{i \in C_j} \lambda_{ji}(LM)\big[K_\infty^{LM}(i) + Q(i)\big]\Big]}$$

$$K_\infty^{RM}(j) = \frac{a^2(j)R(j)\Big[\sum_{i \in C_j} \lambda_{ji}(RM)\big[K_\infty^{RM}(i) + Q(i)\big]\Big]}{R(j) + b^2(j)\Big[\sum_{i \in C_j} \lambda_{ji}(RM)\big[K_\infty^{RM}(i) + Q(i)\big]\Big]}$$

with

$$V_\infty^{LM}(x,j) = x^2 K_\infty^{LM}(j)$$

$$V_\infty^{RM}(x,j) = x^2 K_\infty^{RM}(j) \qquad j \in M .$$


(6.6' 


(6.6! 


(6 . 6< 


The  steady-state  middlepieces  of  the  optimal  control  laws  are 


C  (j)  =  C  (j)/a(j) 


RM  RM 

C(j)  «  (j)/a(j) 


(6. 


These  middlepieces  are  valid 
for  ^(j):  0  <  x  <5^ 

for  vf(j):  §l<x<0 


(6. 


where,  if  (6.60),  (6.62)  hold: 


(6. 


As  with  the  endpieces  we  have  that  if  each  form  is  stabilizable  then 
(by  Corollary  3.5)  these  conditions  are  satisfied  and  the  steady-state 
middle  pieces  exist.  And  for  transient  forms  that  the  system  does  not 
return  to  after  leaving,  we  can  relax  this  stabilizability  requirement 
to 


by  Corollary  3.4. 




Example 6.1 (Example 5.1 Revisited)

From Proposition 6.1 we can compute the endpieces of $V_k(x_k, r_k=1)$
and $u_k(x_k, r_k=1)$ recursively. We find that


^'W11  ‘  ''T'W11  ■  \  ^e<1> 

^‘W1'  -  \  'VV11  -  <(1)\ 

where 

*?<»  -  *>>-° 

K^u, .  <•(„  -  ill  feiiw* 

2  *  7  ^iai  +  4  W2> 

-  ^a,  -  Lfa. 


$V_k^{Le}(x_k,1)$ and $V_k^{Re}(x_k,1)$ are the same in this example because of
the symmetry (about zero) of the form transition probabilities.

From  Proposition  6.6  we  get  the  middlepieces 


vr<*k'11  -  'C'v11  -  \  i^ai 


where 


*a>  -  <"  a,  -  1  *  1 

2  +  !  C‘1,+  i  Vi'21 


The  values  of  these  endpiece  and  middlepiece  parameters  are  listed 
for  several  time  stages  in  table  6.1. 


k 

Endpieces 

LM  RM  LM  RM 

K^M(1)=kJM(1)=L^  (1)=lJ"(1) 

Middlepieces 
(1)  (1)  (1)  =k££ 

N 

N-l 

.7647058 

.636363 

N-2 

.7821909  1 

.694868 

N-3 

.7834843 

.6995943 

i  i 

.7836889  .7000659 


TABLE 6.1:  Middlepiece and endpieces for Example 6.1

Since $b(1) \ne 0$, these parameters quickly converge as (N-k) increases to
the steady-state values

$$K_\infty^{RM}(1) = K_\infty^{LM}(1) = .7836889$$

$$K_\infty^{Le}(1) = K_\infty^{Re}(1) = .7000659$$


The following two examples further illustrate the qualitative
properties of these middlepiece and endpiece cost functions. In
particular, example 6.3 demonstrates that the endpiece and middlepiece
functions can become infinite as $(N-k) \to \infty$ even though the cost-to-go
is, in fact, finite with probability one.


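Example 6.2:

We consider the following system:

$$x_{k+1} = 2x_k \qquad \text{if } r_k = 1$$

$$x_{k+1} = \tfrac{1}{2}x_k + u_k \qquad \text{if } r_k = 2$$

$$p(1,2:x) = \begin{cases} 0 & \text{if } |x| < 1 \\ \lambda = 1/2 & \text{if } |x| > 1 \end{cases}$$

$$p(2,1:x) = \begin{cases} 1 & \text{if } |x| < 1 \\ 1/2 & \text{if } |x| > 1 \end{cases}$$

The form structure and transition probabilities p(1,2:x) and
p(2,1:x) are illustrated in Figure 6.3.

We seek to minimize

$$\min\ E\Big\{ \sum_{k=k_0}^{N-1} \big(u_k^2 + x_{k+1}^2\big) + x_N^2 \Big\} .$$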


Let us consider some qualitative properties of the optimal expected
costs-to-go $V_k(x_k, r_k=1)$ and $V_k(x_k, r_k=2)$. In form r=2 the system is
controllable. Thus we know that $V_k(x_k, r_k=2)$ is bounded (for any finite
x) since the control

$$u_k = -\frac{a(2)}{b(2)}\,x_k$$

will drive $x_{k+1}$ to zero with a (nonoptimal) cost of

$$\frac{a^2(2)}{b^2(2)}\,x_k^2 .$$

Note that the system is not stabilizable in form 1. Thus the value
of |x| will double and a cost of $x_{k+1}^2$ will be charged at each succeeding
time, until the system jumps into form r=2. Once it gets into r=2, the
expected cost-to-go is finite.

Since p(1,2) > 0 for |x| > 1 it is clear that the optimal cost will
be finite with probability one as $(N-k) \to \infty$. As we will see, this does
not guarantee that the expected cost-to-go remains
finite, however. That is, the convergence of the cost-to-go with
probability one does not imply that the controlled system is moment
stable.
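A short calculation makes this distinction concrete. It is a sketch under the assumptions of this example: the system starts in form 1 at time k with $|x_k| > 1$, no control acts in form 1 (b(1) = 0), and a state cost of $x^2$ is charged at every step. The system is still in form 1 at times $k, \dots, k+t-1$ with probability $(1/2)^{t-1}$, and on that event $x_{k+t} = 2^t x_k$, so the expected cost over the horizon is bounded below by

$$\sum_{t=1}^{N-k} \Big(\tfrac{1}{2}\Big)^{t-1} \big(2^t x_k\big)^2 = 2\,x_k^2 \sum_{t=1}^{N-k} 2^t \ \longrightarrow\ \infty \quad \text{as } (N-k) \to \infty ,$$

whereas the probability of never having left form 1 after n steps is $(1/2)^n \to 0$. The divergence comes from the per-step growth factor $p_{11}\,a^2(1) = \tfrac{1}{2} \cdot 4 = 2 > 1$.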


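From Proposition 6.1 we have that

$$K_k^{Le}(1) = K_k^{Re}(1) = 4\Big[1 + \tfrac{1}{2}K_{k+1}^{Le}(1) + \tfrac{1}{2}K_{k+1}^{Le}(2)\Big]$$

$$K_k^{Le}(2) = K_k^{Re}(2) = \frac{\tfrac{1}{4}\Big[1 + \tfrac{1}{2}K_{k+1}^{Le}(1) + \tfrac{1}{2}K_{k+1}^{Le}(2)\Big]}{1 + \Big[1 + \tfrac{1}{2}K_{k+1}^{Le}(1) + \tfrac{1}{2}K_{k+1}^{Le}(2)\Big]}$$

$$K_N^{Le}(1) = K_N^{Le}(2) = 1 .$$

From the first of these equations we can verify that there is no
finite positive steady-state value $K_\infty^{Le}(1)$. If the steady-state values
$K_\infty^{Le}(1)$, $K_\infty^{Le}(2)$ were to both exist then they would have to satisfy

$$K_\infty^{Le}(1) = 4 + 2K_\infty^{Le}(1) + 2K_\infty^{Le}(2)$$

hence

$$K_\infty^{Le}(1) = -4 - 2K_\infty^{Le}(2) .$$

For any $K_\infty^{Le}(2) \ge 0$ (which must be the case), $K_\infty^{Le}(1) \le -4$. Thus
we see that $K_k^{Le}(1)$ grows without bound as $(N-k) \to \infty$.

However, $K_\infty^{Le}(2)$ is finite: as $(N-k) \to \infty$ and $K_{k+1}^{Le}(1) \to \infty$ we have

$$\lim_{(N-k)\to\infty} K_k^{Le}(2) = \frac{a^2(2)R(2)}{b^2(2)} = 1/4 .$$

That is,

$$K_\infty^{Le}(2) = K_\infty^{Re}(2) = 1/4 .$$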

From Proposition 6.4 we have

$$K_k^{LM}(1) = K_k^{RM}(1) = 4\big[1 + K_{k+1}^{LM}(1)\big]$$

$$K_k^{LM}(2) = K_k^{RM}(2) = \frac{\tfrac{1}{4}\big[1 + K_{k+1}^{LM}(1)\big]}{1 + \big[1 + K_{k+1}^{LM}(1)\big]}$$

where

$$K_N^{LM}(1) = K_N^{LM}(2) = 1 .$$

Note that these middle piece costs are not coupled. This is because
the middlepieces are valid only in a region contained inside the
interval (-1,1), in which form r=1 is an absorbing form.

From the above we see that as $(N-k) \to \infty$,

$$K_k^{LM}(1) = K_k^{RM}(1)$$

become infinite and

$$K_k^{LM}(2) = K_k^{RM}(2) \to 1/4 = K_\infty^{LM}(2) = K_\infty^{RM}(2) .$$

The values of the quantities described above are computed for four
time steps in Table 6.2.


k            K_k^Le(1)=K_k^Re(1)   K_k^Le(2)=K_k^Re(2)   K_k^LM(1)=K_k^RM(1)   K_k^LM(2)=K_k^RM(2)

N            1                     1                     1                     1
N-1          8                     .1666666              8                     .166666
N-2          20.333333             .2089041              36                    .225
N-3          45.084474             .229627               148                   .243421
N-4          94.628202             .2398609              596                   .248333
(N-k) → ∞    ∞                     1/4                   ∞                     1/4

TABLE 6.2:  Middlepiece and Endpieces for Example 6.2
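The entries of Table 6.2 can be regenerated with a few lines of code. The sketch below hard-codes the data of Example 6.2 (a(1) = 2, b(1) = 0, a(2) = 1/2, b(2) = 1, Q = R = 1 in both forms, terminal values 1); the function names are illustrative only.

```python
def step(K1, K2, p12, p21):
    """One backward step of the recursion for the two-form system of
    Example 6.2, with x-independent probabilities p(1,2)=p12, p(2,1)=p21."""
    K_hat1 = (1.0 - p12) * (K1 + 1.0) + p12 * (K2 + 1.0)
    K_hat2 = p21 * (K1 + 1.0) + (1.0 - p21) * (K2 + 1.0)
    new_K1 = 4.0 * K_hat1                      # b(1) = 0, so K = a^2 * K_hat
    new_K2 = 0.25 * K_hat2 / (1.0 + K_hat2)    # a(2)^2 = 1/4, b(2) = R(2) = 1
    return new_K1, new_K2


def table(steps, p12, p21):
    """Rows (K(1), K(2)) for k = N, N-1, ..., N-steps."""
    rows = [(1.0, 1.0)]
    for _ in range(steps):
        rows.append(step(*rows[-1], p12, p21))
    return rows


endpieces = table(4, p12=0.5, p21=0.5)     # probabilities valid for |x| > 1
middlepieces = table(4, p12=0.0, p21=1.0)  # probabilities valid for |x| < 1

for n, (e, m) in enumerate(zip(endpieces, middlepieces)):
    label = "N" if n == 0 else f"N-{n}"
    print(f"{label:4s} {e[0]:12.6f} {e[1]:10.7f} {m[0]:8.0f} {m[1]:10.6f}")
```

Increasing the number of steps shows K(1) growing without bound in both columns while K(2) approaches 1/4.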


Note that for both the middlepieces and endpieces in form 1, the
sufficient condition for finite steady-state costs of Corollary 3.4
is not met. That is,

$$p_{11}\,a^2(1) > 1 .$$

This is illustrated in figure 6.4. However the cost-to-go
$V_k(x_k, r_k=1)$ is finite with probability one.

In the next example we let p(1,2:x) for |x| > 1 be a parameter.
If the probability of switching from r=1 to r=2 is high enough, the
endpieces of $V_k(x_k, r_k=1)$ remain finite as (N-k) increases but the
middlepiece of $V_k(x_k, r_k=1)$ still blows up.

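Example 6.3:  We generalize the previous example by considering
arbitrary $\lambda$ values:

$$p(1,2:x) = \begin{cases} \lambda & |x| > 1 \\ 0 & |x| < 1 \end{cases}$$

Then

$$K_k^{Le}(1) = K_k^{Re}(1) = 4\big[1 + (1-\lambda)K_{k+1}^{Le}(1) + \lambda K_{k+1}^{Le}(2)\big]$$

$$K_k^{Le}(2) = K_k^{Re}(2) = \frac{\tfrac{1}{4}\Big[1 + \tfrac{1}{2}K_{k+1}^{Le}(1) + \tfrac{1}{2}K_{k+1}^{Le}(2)\Big]}{1 + \Big[1 + \tfrac{1}{2}K_{k+1}^{Le}(1) + \tfrac{1}{2}K_{k+1}^{Le}(2)\Big]}$$

with $K_k^{LM}(1) = K_k^{RM}(1)$ and $K_k^{LM}(2) = K_k^{RM}(2)$ taking the same values
as in example 6.2.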


[Figure 6.4 panels. For middlepieces: $p_{11}a_1^2 = 4$, $b(2) = 1 \ne 0$, $b(1) = 0$.
For endpieces: $p_{11}a_1^2 = 2$, $b(2) = 1 \ne 0$, $b(1) = 0$.]

FIGURE 6.4:  Form structures applicable for endpieces and
middlepieces in examples 6.2 and 6.3.


From Figure 6.4 we see that the sufficient conditions for the
existence of steady-state middlepieces are not satisfied in form 1
since $p_{11}a_1^2 = 4 > 1$. But if $3/4 < \lambda \le 1$ then the sufficient condition
for the existence of steady-state endpieces in Corollary 3.4 is
satisfied:

$$p_{11}a_1^2 = (1-\lambda)\,4 < 1 \qquad 3/4 < \lambda \le 1 .$$

When this holds we find that for the endpieces, $K_k^{Le}(1) = K_k^{Re}(1)$
converge to a finite positive steady-state value (as do
$K_k^{Le}(2) = K_k^{Re}(2)$ and $K_k^{LM}(2) = K_k^{RM}(2)$) even though the middlepiece in
r=1 has infinite steady-state cost. The steady-state values of the
endpieces of $V_k(x_k, r_k = j)$ are given by

$$K_\infty^{Le}(2) = K_\infty^{Re}(2) = \frac{-(56\lambda-29) + \sqrt{(56\lambda-29)^2 + 4(8\lambda-2)(32\lambda-12)}}{2(32\lambda-12)}$$

$$K_\infty^{Le}(1) = K_\infty^{Re}(1) = \frac{4\big[1 + \lambda K_\infty^{Le}(2)\big]}{4\lambda - 3}$$

For example, take $\lambda = 7/8$. Then we can compute the values shown in
Table 6.3, and from the above we have that

$$K_\infty^{Le}(2) = K_\infty^{Re}(2) = \frac{-20 + \sqrt{720}}{32} = .2135254$$

$$K_\infty^{Le}(1) = K_\infty^{Re}(1) = 9.4946778 .$$
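The closed-form steady-state values can be checked numerically. The following sketch evaluates the two expressions above and verifies that they satisfy the endpiece recursions of this example at a fixed point; the function names are illustrative, and the range check encodes only the sufficient condition $4(1-\lambda) < 1$ quoted from Corollary 3.4.

```python
import math


def steady_state_endpieces(lam):
    """Closed-form steady-state endpiece values (K(1), K(2)) for Example 6.3.
    Only evaluated where the sufficient condition 4*(1 - lam) < 1 holds."""
    if not 0.75 < lam <= 1.0:
        raise ValueError("sufficient condition requires 3/4 < lam <= 1")
    A = 32.0 * lam - 12.0
    B = 56.0 * lam - 29.0
    C = 8.0 * lam - 2.0
    K2 = (-B + math.sqrt(B * B + 4.0 * A * C)) / (2.0 * A)
    K1 = 4.0 * (1.0 + lam * K2) / (4.0 * lam - 3.0)
    return K1, K2


def fixed_point_residuals(lam, K1, K2):
    """Residuals of the two endpiece recursions evaluated at (K1, K2)."""
    r1 = K1 - 4.0 * (1.0 + (1.0 - lam) * K1 + lam * K2)
    X = 1.0 + 0.5 * K1 + 0.5 * K2
    r2 = K2 - 0.25 * X / (1.0 + X)
    return r1, r2


K1, K2 = steady_state_endpieces(7.0 / 8.0)
r1, r2 = fixed_point_residuals(7.0 / 8.0, K1, K2)
```

Both residuals vanish to rounding error, which confirms that the closed forms are consistent with the recursions stated for this example.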


\  (1)-\  (1) 


Examples 6.1-6.3 illustrate some of the diverse behaviors that
the endpieces and middlepieces can exhibit as $(N-k) \to \infty$. These behaviors
are directly related to the expected behavior of the controlled
x-process, and to the qualitative properties of the entire expected
cost-to-go $V_k(x_k, r_k)$. As we saw in example 6.2, for very simple examples
we can get phenomena such as finite cost-to-go w.p.1 but infinite
expected cost.

This concludes our discussion of the middlepieces of $V_k(x_k, r_k=j)$
and $u_k(x_k, r_k=j)$ for the JLQ control problems of section 6.1. We
have thus far characterized the behavior of $V_k(x_k, r_k=j)$ and
$u_k(x_k, r_k=j)$ over extreme values of x (far from zero) and for x near
zero. In particular, we have obtained a description of the steady-
state behavior of these endpieces and middlepieces in terms of
corresponding x-independent JLQ problems of Chapter 3. In the next section
we consider the behavior of the controller over the switching regions
of Figure 6.2, between the endpieces and middlepieces.

6.4  Bounds  on  the  expected  costs-to-go 

In this section we continue our examination of the steady-state
properties of the scalar, x-dependent JLQ controller. We are concerned
with the nature of the expected costs-to-go $V_k(x_k, r_k)$ between the
endpieces and middlepiece (i.e., in the switching regions of Figure 6.2). We
develop upper and lower bounds on $V_k(x_k, r_k)$ here that correspond
to x-independent JLQ problems. These bounds can be computed off line
via recursive difference equations and, using the results of Chapter 3,
we have necessary and sufficient conditions for these bounds to
converge to finite values as the time horizon becomes infinite.

We motivate our derivation of these bounds by the following
example, which demonstrates that the cost-to-go, if we stay with
certainty in the "most expensive form", is not always an upper bound
on $V_k(x_k, r_k=j)$, and the cost-to-go, if we stay with certainty in the
"least expensive form", is not necessarily a lower bound on
$V_k(x_k, r_k=j)$.


p(1,1) = 0        p(2,2) = 0

Q(1) = 1          R(1) = 100

Q(2) = 100        R(2) = 1

That  is,  the  dynamics  in  each  form  are  the  same,  and 

.  in  form  1  the  control  cost  is  high 
.  in  form  2  the  control  cost  is  low. 

The solution to the LQ problem corresponding to staying in
form 1 for all times (that is, with Q=1, R=100) yields

$$V_k^{(1)}(x_k) = x_k^2 K_k^{(1)}$$

where

$$K_N^{(1)} = 0$$

$$K_k^{(1)} = \frac{100\,(K_{k+1}^{(1)} + 1)}{100 + (K_{k+1}^{(1)} + 1)} = \frac{100\,K_{k+1}^{(1)} + 100}{101 + K_{k+1}^{(1)}}.$$


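The solution to the LQ problem corresponding to staying in form 2
for all times (that is, with Q=100, R=1) yields

$$V_k^{(2)}(x_k) = x_k^2 K_k^{(2)}$$

where

$$K_N^{(2)} = 0, \qquad K_k^{(2)} = \frac{K_{k+1}^{(2)} + 100}{1 + (K_{k+1}^{(2)} + 100)}.$$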


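The solution to the x-independent JLQ problem (by Proposition
3.1) yields

$$V_k(x_k, r_k=i) = x_k^2 K_k(i) \qquad i = 1, 2$$

where

$$K_N(1) = K_N(2) = 0$$

$$K_k(1) = \frac{100\,(100 + K_{k+1}(2))}{100 + (100 + K_{k+1}(2))} = \frac{10^4 + 100\,K_{k+1}(2)}{200 + K_{k+1}(2)}$$

$$K_k(2) = \frac{1 + K_{k+1}(1)}{2 + K_{k+1}(1)}.$$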

All of the above costs are listed for four time steps in Table 6.4.

            always in           always in            optimal solutions
            "cheap" form        "expensive" form     to flip-flop problem
            Q(2)=100, R(2)=1    Q(1)=1, R(1)=100
  time k    K_k^(2)             K_k^(1)              K_k(1)         K_k(2)

  N-1       .990099             .990099              50             .5
  N-2       .9901951            1.9512669            50.124688      .9807692
  N-3       .9901951            2.866664             50.243994      .9808152
  N-4       .9901951            3.7227189            50.244006      .980859

TABLE 6.4: Costs for example 6.4.

From Table 6.4 we see that the optimal costs-to-go in the
flip-flop JLQ problem are not bounded by the cheap and expensive
LQ problems. That is

$$V_k(x_k, r_k=1) > V_k^{(1)}(x_k), \qquad V_k(x_k, r_k=2) < V_k^{(2)}(x_k).$$
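The three recursions behind Table 6.4 are short enough to run directly. The sketch below assumes $a(1)=a(2)=1$ and $b(1)=b(2)=1$ (the values consistent with the tabulated numbers) and zero terminal costs; `lq_step` and `table_6_4` are illustrative names, not notation from the text.

```python
def lq_step(s, r):
    """One scalar Riccati step with a = b = 1: K = R*S / (R + S), where S = K_next + Q."""
    return r * s / (r + s)


def table_6_4(steps=4):
    """Return rows (cheap LQ, expensive LQ, flip-flop K(1), flip-flop K(2)) for k = N-1, N-2, ..."""
    q = {1: 1.0, 2: 100.0}
    r = {1: 100.0, 2: 1.0}
    cheap = expensive = 0.0          # always in form 2 / always in form 1
    flip = {1: 0.0, 2: 0.0}          # flip-flop JLQ problem, p(1,1) = p(2,2) = 0
    rows = []
    for _ in range(steps):
        cheap = lq_step(cheap + q[2], r[2])
        expensive = lq_step(expensive + q[1], r[1])
        # In the flip-flop problem, form 1 is always followed by form 2 and vice versa,
        # so the control penalty of the current form meets the state cost of the other.
        flip = {
            1: lq_step(flip[2] + q[2], r[1]),
            2: lq_step(flip[1] + q[1], r[2]),
        }
        rows.append((cheap, expensive, flip[1], flip[2]))
    return rows


if __name__ == "__main__":
    for i, row in enumerate(table_6_4(), start=1):
        print("N-%d" % i, " ".join("%.7f" % v for v in row))
```

Each row shows the flip-flop cost from form 1 above the "expensive" LQ cost and the flip-flop cost from form 2 below the "cheap" LQ cost, which is the failure of the naive bounds discussed here.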


The reasons for this can be summarized as follows:

1. $V_k^{(2)}$ is not a lower bound on $V_k(x_k, r_k)$ because

   in the "cheap form" problem the optimal LQ
   controller (assuming $r_k = 2$, $\forall k$) spends a lot
   of control energy (since R(2) is only 1) to
   avoid the relatively expensive (Q(2)=100)
   cost on $x_{k+1}^2$.

   in the flip-flop problem when $r_k = 2$ the controller
   does not have to spend as much energy
   since $r_{k+1}$ will be 1, and thus the lower cost
   $x_{k+1}^2 Q(1) = x_{k+1}^2$
   will be charged instead of $x_{k+1}^2 Q(2) = 100\,x_{k+1}^2$.


is  the  index  of  the  p(j,i:x)  piece  that  is  valid  for 


i 


Here  % 


x  e 


The proof of this proposition is given in Appendix C.8.

Basically these bounds arise by taking the worst case and best case
transition probability pieces in (6.78), (6.77) at each time (for
each $j \in M$). Thus the bounds are quadratic (not piecewise quadratic)
in $x_k$. In Figure 6.5 this upper and lower bound is illustrated for
an example problem. Note that for this particular example, the
upper bound and left endpiece are the same. That is,
$V_k^{Le}(x_k, j) = V_k^{UB}(x_k, j)$. In general the endpieces need not be the
same as either the upper or lower bound.

We now consider the existence of steady-state upper and lower
bounds on the steady-state expected cost-to-go:

$$V_\infty^{LB}(x,j) \triangleq \lim_{(N-k)\to\infty} V_k^{LB}(x_k = x, r_k = j)$$

$$V_\infty^{UB}(x,j) \triangleq \lim_{(N-k)\to\infty} V_k^{UB}(x_k = x, r_k = j)$$

for the infinite time horizon problem where (if all of these quantities
exist):

$$V_\infty^{LB}(x,j) \le \lim_{(N-k)\to\infty} V_k(x_k = x, r_k = j) \le V_\infty^{UB}(x,j).$$


We cannot directly apply the conditions of Proposition 3.2 to
Proposition 6.7 (as we did for steady-state endpieces in Proposition
6.2 and middlepieces in Proposition 6.6) because the upper and lower
bound calculations in (6.75)-(6.79) do not directly correspond to
time-invariant x-independent problems: the choice of index (t) in
(6.77)-(6.78) may change with k, as (N-k) increases. However we can
find weaker upper and lower bounds on the expected costs-to-go that do
correspond to x-independent JLQ problems and that do converge as
$(N-k) \to \infty$.

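Proposition 6.8: (Steady-state Bounds)

With

$$\bar p_{ji} = \max_{t=1,\ldots,\nu_{ji}} \lambda_{ji}(t) \qquad \text{all } i,j \in M \tag{6.80}$$

conditions (1)-(3) of Proposition 3.2 are sufficient for the existence
of a set of nonnegative scalars

$$\{\bar K(j) \ge 0,\ j \in M\}$$

such that, as $(N-k) \to \infty$, we have for each $j \in M$

$$K_k^{UB}(j) \to \bar K(j). \tag{6.81}$$

Here $\{\bar K(j): j \in M\}$ are the nonnegative solutions of the set of M
coupled equations

$$\bar K(j) = \frac{a^2(j)R(j)\left[\sum_{i \in C_j} \bar p_{ji}\,(\bar K(i) + Q(i))\right]}{R(j) + b^2(j)\left[\sum_{i \in C_j} \bar p_{ji}\,(\bar K(i) + Q(i))\right]} \tag{6.82}$$

with the $\bar p_{ji}$ in (6.82) given by (6.80).

Similarly with

$$\underline p_{ji} = \min_{t=1,\ldots,\nu_{ji}} \lambda_{ji}(t) \qquad \text{all } i,j \in M \tag{6.83}$$

conditions (1)-(3) of Proposition 3.2 are sufficient for the existence
of a set of nonnegative scalars

$$\{\underline K(j) \ge 0,\ j \in M\}$$

such that, as $(N-k) \to \infty$, for each $j \in M$

$$K_k^{LB}(j) \to \underline K(j) \tag{6.84}$$

where $\{\underline K(j): j \in M\}$ are the nonnegative solutions of the set of M
coupled equations

$$\underline K(j) = \frac{a^2(j)R(j)\left[\sum_{i \in C_j} \underline p_{ji}\,(\underline K(i) + Q(i))\right]}{R(j) + b^2(j)\left[\sum_{i \in C_j} \underline p_{ji}\,(\underline K(i) + Q(i))\right]} \tag{6.85}$$

with the $\underline p_{ji}$ in (6.85) given by (6.83). Thus as $(N-k) \to \infty$ we
have

$$V_k^{UB}(x,j) = x^2 K_k^{UB}(j) \to x^2 \bar K(j) \tag{6.86}$$

$$V_k^{LB}(x,j) = x^2 K_k^{LB}(j) \to x^2 \underline K(j) \tag{6.87}$$

respectively, for each $j \in M$.  □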


The proof of this proposition appears in Appendix C.9. These
bounds correspond to the highest and lowest possible cost parameters
at each time stage. Note that for problems with each form stabilizable
the above conditions are immediately met.

To summarize, in this section we have obtained upper and lower
bounds on $V_k(x_k, r_k=j)$ that are recursively computed with an
embedded comparison of scalar quantities at each time step (in
(6.77)-(6.78)).

We then obtained sufficient conditions for weaker bounds to
converge to steady-state values as $(N-k) \to \infty$. For the problems in Chapter 6
(i.e., when (6.1)-(6.4) hold) the stabilizability of each form is
then sufficient for the existence of steady-state endpieces, middlepieces
and overall bounds on the costs-to-go. Example 6.3 shows that
this is not a necessary condition.
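The coupled equations of Proposition 6.8 can be solved numerically by iterating them from zero. The following is a rough sketch under stated assumptions: the piecewise-constant transition values `lam[j][i]` (a list over pieces t) are hypothetical, the sum over the cover set is written as a sum over all forms (zero-probability terms drop out), and simple fixed-point iteration is used in place of whatever solution method the proposition's proof relies on.

```python
def extreme_probabilities(lam, pick):
    """Collapse piecewise values lam[j][i] (a list over pieces t) with pick = max or min."""
    return [[pick(pieces) for pieces in row] for row in lam]


def riccati_step(k, a, b, q, r, p):
    """One pass of K(j) = a_j^2 R_j S_j / (R_j + b_j^2 S_j), S_j = sum_i p[j][i] (K(i) + Q(i))."""
    m = len(k)
    s = [sum(p[j][i] * (k[i] + q[i]) for i in range(m)) for j in range(m)]
    return [a[j] ** 2 * r[j] * s[j] / (r[j] + b[j] ** 2 * s[j]) for j in range(m)]


def steady_state_gains(a, b, q, r, p, iters=2000):
    """Iterate the coupled equations from K = 0 (monotone increasing when it converges)."""
    k = [0.0] * len(a)
    for _ in range(iters):
        k = riccati_step(k, a, b, q, r, p)
    return k


if __name__ == "__main__":
    # Hypothetical two-form example: form 1 has two probability pieces, form 2 is absorbing.
    lam = [[[0.9, 0.5], [0.1, 0.5]], [[0.0], [1.0]]]
    a, b, q, r = [1.2, 0.8], [1.0, 1.0], [1.0, 2.0], [1.0, 1.0]
    print(steady_state_gains(a, b, q, r, extreme_probabilities(lam, max)))
    print(steady_state_gains(a, b, q, r, extreme_probabilities(lam, min)))
```

Because the right-hand side is nondecreasing in both the gains and the transition values, the solution computed with the maxima dominates the one computed with the minima, which is the ordering the proposition asserts.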


6.5  A  Single  Form-Transition  Problem 

In this section we formulate a special class of JLQ problems
that will be used in the remainder of this chapter and Chapter 7
to illustrate various qualitative properties of the x-dependent JLQ
controller.

We consider systems with M=2 forms:

$$x_{k+1} = a(r_k)\,x_k + b(r_k)\,u_k, \qquad r_k \in \{1, 2\} \tag{6.88}$$

$$p(1,2:x) = \begin{cases} \omega_1 & \text{if } |x| < \alpha \\ \omega_2 & \text{if } |x| > \alpha \end{cases} \tag{6.89}$$

$$p(1,1:x) = 1 - p(1,2:x), \qquad p(2,1) = 0, \qquad p(2,2) = 1 \tag{6.90}$$

The form structure and possible shapes of p(1,2:x) are shown in
Figure 6.6. There is only one possible form change here (from r=1
to r=2) and the form transition probabilities are symmetric about zero.

[Figure 6.6: Form structure (a), and p(1,2:x) for (6.1)-(6.2) (panels (b) and (c); one panel is labeled $\omega_1 > \omega_2$).]
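To make the formulation concrete, a sample path of (6.88)-(6.90) can be simulated. This is only an illustrative sketch: it assumes the form change out of r=1 is drawn using p(1,2:x) evaluated at the newly reached state, it groups the boundary $|x| = \alpha$ with the outer piece, and it uses a hypothetical fixed linear feedback `gain[form]` rather than the optimal controller.

```python
import random


def simulate(a, b, omega1, omega2, alpha, gain, x0, steps, seed=0):
    """Simulate x_{k+1} = a(r) x + b(r) u with a single possible form change 1 -> 2."""
    rng = random.Random(seed)
    x, form = x0, 1
    states, forms = [x], [form]
    for _ in range(steps):
        u = gain[form] * x                      # hypothetical linear feedback law
        x = a[form] * x + b[form] * u           # state update (6.88)
        if form == 1:
            # Piecewise-constant transition probability (6.89); boundary handling assumed.
            p_change = omega1 if abs(x) < alpha else omega2
            if rng.random() < p_change:
                form = 2                        # form 2 is absorbing (6.90)
        states.append(x)
        forms.append(form)
    return states, forms


if __name__ == "__main__":
    a = {1: 1.1, 2: 1.3}
    b = {1: 1.0, 2: 1.0}
    gain = {1: -0.5, 2: -0.8}
    xs, rs = simulate(a, b, 0.05, 0.4, 1.0, gain, x0=3.0, steps=20, seed=1)
    print(rs)
```

The only structural facts used are the ones stated above: one possible form change, symmetric piecewise-constant probabilities, and an absorbing form 2.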


We seek to minimize

$$\min_{u_0,\ldots,u_{N-1}} E\left\{ \sum_{k=0}^{N-1} \left[ u_k^2 R(r_k) + x_{k+1}^2 Q(r_{k+1}) \right] + x_N^2 K_T(r_N) \right\} \tag{6.91}$$


Here for each j=1,2, the following parameters are all finite

$$Q(1) > 0, \quad Q(2) > 0, \quad K_T(j) \ge 0, \quad R(j) > 0, \quad b(j) \ne 0, \quad a(j) > 0, \quad \alpha > 0 \tag{6.92}$$

and

$$0 < \omega_1 < 1, \qquad 0 < \omega_2 < 1. \tag{6.93}$$

From the symmetry of the form transition probabilities (6.88)-(6.90)
and costs (6.91) about x = 0, it is clear that the expected costs-to-go
$V_k(x_k, r_k)$ will be symmetric about zero.

Note  that  this  class  of  example  problems  includes  example  5.1 


as  a  special  case. 


Note that once the system enters form r=2, it stays there. Thus
the usual LQ theory yields the following:

$$V_k(x_k, r_k=2) = x_k^2 K_k(1:2) \qquad k = N, N-1, \ldots, 0 \tag{6.94}$$

$$u_k(x_k, r_k=2) = L_k(1:2)\,x_k \qquad k = N-1, \ldots, 0 \tag{6.95}$$

where

$$K_N(1:2) = K_T(2)$$

$$K_k(1:2) = \frac{a^2(2)R(2)\,[K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2)\,[K_{k+1}(1:2) + Q(2)]} \tag{6.96}$$

$$L_k(1:2) = \frac{-b(2)a(2)\,[K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2)\,[K_{k+1}(1:2) + Q(2)]}. \tag{6.97}$$

Since $b(2) \ne 0$, $K_k(1:2)$ converges monotonely as (N-k) increases, to
the steady-state value $K_\infty(1:2)$,
where $K_k(1:2)$ decreases as (N-k) increases if $K_T(2) > K_\infty(1:2)$ and
increases if $K_T(2) < K_\infty(1:2)$.
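The monotone convergence of the form 2 recursion is easy to observe numerically. The sketch below iterates (6.96)-(6.97) backward from a chosen terminal cost; the parameter values in the usage example are hypothetical and the function name is illustrative.

```python
def form2_riccati(a, b, q, r, k_terminal, steps):
    """Iterate (6.96)-(6.97) backward from K_N(1:2) = K_T(2).

    Returns the list of gains K_N, K_{N-1}, ... and the feedback gains L_{N-1}, L_{N-2}, ...
    """
    gains = [k_terminal]
    feedback = []
    k = k_terminal
    for _ in range(steps):
        s = k + q                                   # K_{k+1}(1:2) + Q(2)
        feedback.append(-b * a * s / (r + b * b * s))   # L_k(1:2), sign as in (6.97)
        k = a * a * r * s / (r + b * b * s)             # K_k(1:2), as in (6.96)
        gains.append(k)
    return gains, feedback


if __name__ == "__main__":
    # Hypothetical form 2 parameters: a(2) = 1.5, b(2) = 1, Q(2) = 1, R(2) = 1.
    from_below, _ = form2_riccati(1.5, 1.0, 1.0, 1.0, k_terminal=0.0, steps=30)
    from_above, _ = form2_riccati(1.5, 1.0, 1.0, 1.0, k_terminal=10.0, steps=30)
    print(from_below[-1], from_above[-1])
```

Starting below the steady-state value the gains increase, starting above it they decrease, and both sequences approach the same limit, as stated above.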

Now consider what happens when $r_{N-1} = 1$. We are given that
at time k=N,

$$V_N(x_N, r_N=1) = x_N^2 K_T(1). \tag{6.99}$$

From  sections  6.2  and  6.3,  the  endpieces  and  middlepiece  of 
V^(xk,rk=l)  and  (x^r^*!)  are  given  by 


..Re,  >  „Le  ,  ..  2  Le,,, 

vk  'V11  ■  \  ‘V11  *  W  111 
“f'V11  '  ^‘V11  ■  -tke‘llxk 

and 

RM  LM  2  LM 

\  (V1}  -  vk  <V1}  -  W  (1> 

RM,  LM,  rLM,_. 

\  'V11  ’  \  <Vl)  '  (1,\ 


where 


K^(l)  -  K^U)  -  K^(l)  »  K^(l)  -  Kt(1) 


and 


Le  Re  LM  RM 

(1)  2  (1),  (1)  =  k£"(1> 


(6.100) 

(6.101) 

(6.102) 

(6.103) 


are  given  by 


(6.104) 


K^d)  *  (l-ta^)  (^(D+fid))  +  u1(Kk+1(l:2)+Q(2)). 

j  ^  R6 

Since  b  (1)5*0,  the  steady-state  quantities  Kw  (1)  =  (1)  and 

K^d)  =  K^d)  are  finite  and  positive,  satisfying 
00  00 


(6.106) 


a  (1)R<1) 


RM  LM 

K"”(l)  =  K~(l)  = 

00  CO 


L  \Q(1)  /  \Q(  2)  /- 


R(l)+b2(l)  (1-oj1)/K«»  U)  j  +  w1/K°°(1:2) 


(6.107) 


The partition of $x_N$ values specified in step 1 of Section 5.4 is

$$\Delta_N(1) = (\gamma_N(0), \gamma_N(1)) = (-\infty, -\alpha)$$

$$\Delta_N(2) = (\gamma_N(1), \gamma_N(2)) = (-\alpha, \alpha)$$

$$\Delta_N(3) = (\gamma_N(2), \gamma_N(3)) = (\alpha, \infty)$$

as shown in Figure 6.7.

The  condj  tional  expected  cost-to-go 


Vi-11 


W*’  +  W*1  +  V*’ 


for  x  e  A  (t) 
N  N 


with

$$\hat K_N(1) = \hat K_N(3) = (1-\omega_2)\,(K_T(1) + Q(1)) + \omega_2\,(K_T(2) + Q(2))$$

$$\hat K_N(2) = (1-\omega_1)\,(K_T(1) + Q(1)) + \omega_1\,(K_T(2) + Q(2))$$

can take the two possible shapes shown in Figure 6.8, depending
upon the values of $\omega_1$, $\omega_2$, Q(1), Q(2), $K_T(1)$ and $K_T(2)$. We will
consider each case in turn.

The superscript "1" is not used in $\Delta_N(t)$, $\hat K_N(t)$, etc., in this
section since we are only considering form 1.

Case 1: If

$$(\omega_2 - \omega_1)\,[(K_T(2) + Q(2)) - (K_T(1) + Q(1))] > 0 \tag{6.108}$$

hence

$$\hat K_N(1) > \hat K_N(2) \tag{6.109}$$

then $\hat V_N(x_N | r_{N-1}=1)$ is as shown in Figure 6.8(a). The conditions
(6.108), (6.109) are met when

• $\omega_2 > \omega_1$: the probability of the form change is
  greater away from zero than near it

• $K_T(2) + Q(2) > K_T(1) + Q(1)$: the cost charged at time
  N is greater in form 2 than in form 1.

This corresponds to regulation problems in failure-prone systems. The
system is operating normally when r=1 and has failed when r=2. A
higher cost is charged in the failed mode than in normal operation,
and the probability of failure is greater away from the regulator
goal of zero than near it.

Example 5.1 illustrates this situation. Conditions (6.108)-(6.109)
are also met when

• $\omega_1 > \omega_2$: the probability of the form change is
  greater near zero than away from it

• $K_T(1) + Q(1) > K_T(2) + Q(2)$: the cost charged at time N
  is greater in form 1 than in form 2.

Case 2: If

$$(\omega_1 - \omega_2)\,[(K_T(2) + Q(2)) - (K_T(1) + Q(1))] > 0 \tag{6.110}$$

hence

$$\hat K_N(2) > \hat K_N(1) \tag{6.111}$$

then we have the situation shown in Figure 6.8(b). Conditions
(6.110)-(6.111) are met in problems where the probability of transition
from r=1 to r=2 is at "cross purposes" with the cost structure.
The case

$$\omega_1 > \omega_2, \qquad K_T(2) + Q(2) > K_T(1) + Q(1)$$

corresponds to the probability of "failure" (i.e., changing to the
higher cost form) being higher near the regulator goal of zero than
away from it. The case

$$\omega_2 > \omega_1, \qquad K_T(1) + Q(1) > K_T(2) + Q(2)$$

corresponds to the probability of "success" being lower near the
regulator goal than away from it. As we will see in the next section,
the "cross purposes" of the form transition probabilities and costs
can lead to local minima of the expected costs-to-go.
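The case distinction above depends only on the sign of one product, which a few lines of code make explicit. This is a small sketch: it takes the conditional gains to be the $\omega$-weighted averages of $K_T(j) + Q(j)$, with $\omega_2$ applying for $|x_N| > \alpha$ and $\omega_1$ for $|x_N| < \alpha$; the function names are illustrative.

```python
def conditional_gains(omega1, omega2, kt1, q1, kt2, q2):
    """Return (K_hat_N(1) = K_hat_N(3), K_hat_N(2)) for the single form-transition problem."""
    stay_cost = kt1 + q1      # charged at time N if the system remains in form 1
    jump_cost = kt2 + q2      # charged at time N if the system moves to form 2
    k_outer = (1.0 - omega2) * stay_cost + omega2 * jump_cost   # |x_N| > alpha
    k_inner = (1.0 - omega1) * stay_cost + omega1 * jump_cost   # |x_N| < alpha
    return k_outer, k_inner


def classify(omega1, omega2, kt1, q1, kt2, q2):
    """Case 1: outer gain larger (hedging toward zero). Case 2: costs at cross purposes."""
    d = (omega2 - omega1) * ((kt2 + q2) - (kt1 + q1))
    if d > 0:
        return "case 1"
    if d < 0:
        return "case 2"
    return "no discontinuity"


if __name__ == "__main__":
    # Failure-prone regulator: failure is more likely far from zero and costs more.
    print(classify(0.1, 0.5, 0.0, 1.0, 0.0, 10.0))
    # Same costs, but failure is more likely near zero.
    print(classify(0.5, 0.1, 0.0, 1.0, 0.0, 10.0))
```

The difference of the two gains equals the product tested in `classify`, so the sign test and the comparison of $\hat K_N(1)$ with $\hat K_N(2)$ always agree.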


6.6 Last Stage Solution

In this section we develop the last-stage solution for the two
cases of the last section. The solutions of these one-stage problems
illustrate certain basic qualitative properties of x-dependent JLQ
controllers.

Using  Appendix  C. 1  we  find  that  for  both  cases  of  the  last 
section, 

/  b2(l)i^(2)\ 

Vl121  •  -“l1*  -(I)'-  -  )  (6-U2> 


Vl<2> 


Vi(3> 


=  all+ 


a|i+ 


b2(l)KN(2) 

Taj 

^(1)^(3) 
R  (1) 


■) 

) 


(6.113) 

(6.114) 


0..  .  (1) 


L  b2,1.i>(1) ) 


(6.115) 


and  from  Proposition  5.2  we  find  that 


candidate costs-to-go must be considered. For Case 1
($\hat K_N(1) = \hat K_N(3) > \hat K_N(2)$) these candidates are


1,U  2,U  3,U  2,L  2,R 

Vl'  Vl'  Vl'  Vl'  UN-1 


if  kn(1)  -  *Si(3>>  V2) 


(i.e.,  suid  eliminated) 

N-l  N-l 

and the $\theta$'s and $\Theta$'s in (6.112)-(6.116) satisfy


0  .  (1)<  0  (2) <  0  .  (2) <  9  (3) 

N-l  N-l  '  N-l  N-l 

For Case 2 ($\hat K_N(1) = \hat K_N(3) < \hat K_N(2)$) the candidate costs are


.  ,.1,U  ,.2,U  3,U  3,L  1,R 

Vi'  Vi'  Vi'  Vi'  Vi 
if  V2)>  V1}  EV3) 


(i.e. ,  V2 ' ^  and  V2 ' ^ 
'  '  N-l  N-l 


r2'f  eliminated) 
N-l 


9..  .  (2)  <  0.  .  (1)<  0..  .  (3)  <  0.  .  (2) 


(6.116) 


(6.117) 


(6.118) 


(6.119) 


(6.120) 


The eligible costs for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ over various $x_{N-1}$
values are shown for each case in Figure 6.9.

From Figure 6.9(a) it is evident that in Case 1 there are intervals
of $x_{N-1}$ values over which the optimal controller must involve hedging
to a point.

to  a  point.  These  are ' 


We will now determine $V_{N-1}(x_{N-1}, r_{N-1}=1)$ for these Case 1 problems
where the costs and form transition probabilities are not at "cross
purposes". That is, where

$$\hat K_N(1) = \hat K_N(3) > \hat K_N(2).$$

We have already computed $V_{N-1}(x_{N-1}, r_{N-1}=1)$, $u_{N-1}(x_{N-1}, r_{N-1}=1)$ and
$x_N(x_{N-1}, r_{N-1}=1)$ for an example problem of this type in Section 5.3. The
same steps detailed there (with the shortcuts described in Section 5.6)
yield the following:

Fact 6.9: When $\hat K_N(1) = \hat K_N(3) > \hat K_N(2)$ (Case 1), the optimal cost-to-go
and control laws have

$$m_{N-1}(1) = 5 \tag{6.121}$$

pieces, joined at $x_{N-1}$ values


Figure 6.9: Eligible costs for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ when
(a) $\hat K_N(1) = \hat K_N(3) > \hat K_N(2)$ (Case 1) and
(b) $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$ (Case 2).


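where $m_{N-1}(1) = 5$ and $\delta_{N-1}(0) = -\infty$, $\delta_{N-1}(5) = +\infty$ with

$$K_{N-1}(1:1) = \frac{a^2(1)R(1)\hat K_N(1)}{R(1) + b^2(1)\hat K_N(1)} = K_{N-1}(5:1) \tag{6.128}$$

$$K_{N-1}(2:1) = \frac{a^2(1)R(1)}{b^2(1)} = K_{N-1}(4:1) \tag{6.129}$$

$$K_{N-1}(3:1) = \frac{a^2(1)R(1)\hat K_N(2)}{R(1) + b^2(1)\hat K_N(2)} \tag{6.130}$$

$$H_{N-1}(2:1) = \frac{2a(1)R(1)\alpha}{b^2(1)} = -H_{N-1}(4:1) \tag{6.131}$$

$$H_{N-1}(1:1) = H_{N-1}(3:1) = H_{N-1}(5:1) = 0 \tag{6.132}$$

$$G_{N-1}(2:1) = \frac{\alpha^2}{b^2(1)}\left(R(1) + b^2(1)\hat K_N(2)\right) = G_{N-1}(4:1) \tag{6.133}$$

$$G_{N-1}(1:1) = G_{N-1}(3:1) = G_{N-1}(5:1) = 0 \tag{6.134}$$

The optimal control law is

$$u_{N-1}(x_{N-1}, r_{N-1}=1) = -L_{N-1}(t:1)\,x_{N-1} + F_{N-1}(t:1) \qquad \text{for } \delta_{N-1}(t-1) < x_{N-1} < \delta_{N-1}(t) \tag{6.135}$$

with

$$L_{N-1}(1:1) = \frac{a(1)b(1)\hat K_N(1)}{R(1) + b^2(1)\hat K_N(1)} = L_{N-1}(5:1) \tag{6.136}$$

$$L_{N-1}(2:1) = \frac{a(1)}{b(1)} = L_{N-1}(4:1) \tag{6.137}$$

$$L_{N-1}(3:1) = \frac{a(1)b(1)\hat K_N(2)}{R(1) + b^2(1)\hat K_N(2)} \tag{6.138}$$

$$F_{N-1}(2:1) = \frac{-\alpha}{b(1)} = -F_{N-1}(4:1) \tag{6.139}$$

$$F_{N-1}(1:1) = F_{N-1}(3:1) = F_{N-1}(5:1) = 0 \tag{6.140}$$

The $x_N$ value obtained by application of the optimal control
law, as a function of $x_{N-1}$, is

$$x_N(x_{N-1}, r_{N-1}=1) = [a(1) - b(1)L_{N-1}(t:1)]\,x_{N-1} + b(1)F_{N-1}(t:1) \qquad \text{for } \delta_{N-1}(t-1) < x_{N-1} < \delta_{N-1}(t) \tag{6.141}$$

hence

$$x_N(x_{N-1}, r_{N-1}=1) = \begin{cases} \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(1)}\,x_{N-1} & t = 1 \\ -\alpha & t = 2 \\ \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(2)}\,x_{N-1} & t = 3 \\ +\alpha & t = 4 \\ \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(3)}\,x_{N-1} & t = 5 \end{cases} \tag{6.142}$$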
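The five-piece structure can be reproduced numerically by minimizing the one-stage cost region by region. The sketch below is an illustration under stated assumptions, not the algorithm of Section 5.4: the one-stage cost is taken to be $u^2 R(1) + x_N^2 \hat K_N(t)$ for $x_N$ in the t-th region, each region is treated as closed (so that hedging exactly to $\pm\alpha$ attains the limiting cost), and the parameter values in the example are hypothetical.

```python
import math


def one_stage(x, a, b, r, k_hat, alpha):
    """Minimize u^2 R + x_next^2 K_hat(t) over u, with x_next = a x + b u.

    k_hat = (K_hat_N(1), K_hat_N(2), K_hat_N(3)) applies on
    (-inf, -alpha], [-alpha, alpha], [alpha, inf) respectively.
    Returns (cost, x_next, u).
    """
    regions = [
        (-math.inf, -alpha, k_hat[0]),
        (-alpha, alpha, k_hat[1]),
        (alpha, math.inf, k_hat[2]),
    ]
    best = None
    for lo, hi, k in regions:
        # Unconstrained minimizer of the convex quadratic for this gain ...
        target = a * r * x / (r + b * b * k)
        # ... clipped to the region; a clipped target means hedging to a point.
        x_next = min(max(target, lo), hi)
        u = (x_next - a * x) / b
        cost = r * u * u + k * x_next * x_next
        if best is None or cost < best[0]:
            best = (cost, x_next, u)
    return best


if __name__ == "__main__":
    # Hypothetical Case 1 data: a(1) = b(1) = R(1) = 1, alpha = 1, K_hat = (2, 1, 2).
    for x in (-6.0, -3.0, -1.0):
        print(x, one_stage(x, 1.0, 1.0, 1.0, (2.0, 1.0, 2.0), 1.0))
```

With these numbers the state $x_{N-1} = -3$ lies in a hedging region (the controller drives $x_N$ exactly to $-\alpha$), while $x_{N-1} = -6$ and $x_{N-1} = -1$ are handled by the ordinary linear pieces, matching the pattern of (6.142).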


Figure 6.10: $V_{N-1}(x_{N-1}, r_{N-1}=1)$ when $\hat K_N(1) = \hat K_N(3) > \hat K_N(2)$ (Case 1).
The optimal candidate cost function over each region of
$x_{N-1}$ values is indicated by the solid line.


In Figure 6.10 $V_{N-1}(x_{N-1}, r_{N-1}=1)$ is shown for this case.
The slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ decreases discontinuously at
$\delta_{N-1}(1)$ and $\delta_{N-1}(4)$, and is continuous elsewhere. For
$x_{N-1} \in (\delta_{N-1}(1), \delta_{N-1}(2))$ the optimal controller actively hedges to
$x_N = -\alpha$ and, as is evident in Figure 6.10, the resulting optimal
expected cost-to-go is a quadratic interpolation between $V^{1,U}_{N-1}$ and
$V^{2,U}_{N-1}$. Similarly, the optimal expected cost-to-go over
$x_{N-1} \in (\delta_{N-1}(3), \delta_{N-1}(4))$ is a quadratic interpolation between
$V^{2,U}_{N-1}$ and $V^{3,U}_{N-1}$.

The width of these one-step hedging regions is


(6.143) 


Thus we see that the widths of these regions:

• increase as the "control effectiveness" $b^2(1)/R(1)$ in
  form 1 increases. (Thus we have more hedging to a
  point when the control cost is low than when it
  is high, and we have more hedging when the input
  gain is large than when it is small.)

• are linearly related to the ratio of
  (the distance of the point we are hedging to from zero)
  to (the open loop dynamics in form 1, a(1)).
  In other words the more stable the open loop
  system is, the smaller the range of $x_{N-1}$ values
  where hedging to a point is optimal.

• increase as the difference in costs between the
  "good" and "bad" sides of the $\hat V_N(x_N | r_{N-1}=1)$
  discontinuities, $\hat K_N(1) - \hat K_N(2)$, increases. Thus
  if the savings obtained by hedging are very
  small ($\hat K_N(1) \approx \hat K_N(2)$), the range of $x_{N-1}$ values
  where hedging to a point is optimal also
  becomes small.
In Figure 6.11, $u_{N-1}(x_{N-1}, r_{N-1}=1)$ is shown for b(1)>0 (if b(1)<0, the
graph is flipped around the $x_{N-1}$ axis). The control law increases
discontinuously at $x_{N-1} = \delta_{N-1}(1)$, where the optimal strategy changes
from driving $x_N$ into $\Delta_N(1)$, to hedging to the point $-\alpha \in \Delta_N(2)$.
Similarly, $u_{N-1}(x_{N-1}, r_{N-1}=1)$ increases discontinuously at $x_{N-1} = \delta_{N-1}(4)$.
At all other values of $x_{N-1}$ it is continuous.

In Figure 6.12 the value of $x_N$ obtained by application of the
optimal control law is plotted (as a function of $x_{N-1}$). From (6.142)
and Figure 6.12 we can deduce that

• the optimal closed loop system is more stable than
  the open loop system in form 1 for Case 1 problems.

That is, the optimal controller "brakes" the open loop system dynamics.
To see this note that over the regions of $x_{N-1}$ values that do not


;6 


Thus  the  closed  loop  system  is  more  stable  than  the  unforced  system. 

Note that there are two regions of $x_N$ values that are avoided by
the optimal controller:


N 


a,  a  l 


R(l)+b*(l)KN(2) 

R(l)+b2(l)K>J(15 


N 


R(l)+b  (1)K^(2) 


R(l)+b  (DKjjd) 


(6.144) 


The width of each of these regions of $x_N$ avoidance is


a  <  a(v 


R(l)+b  (1)Kn<2) 


<2a 


R ( 1 )  +b  (DK^d) 


(6.145) 


Thus the widths of these regions of avoidance are

• linearly related to the distance of the point
  we are hedging to from zero (i.e., $\alpha$)

• increase as the savings from hedging increases.

• increase as the control effectiveness $b^2(1)/R(1)$
  in form 1 increases.

Each region of avoidance here is associated with a joining point
where the slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ decreases discontinuously
(i.e., with $\delta_{N-1}(1)$ and $\delta_{N-1}(4)$). These are the $x_{N-1}$ values where
two candidate costs cross.

We now examine Case 2 problems where the x-costs and form
transition probabilities are at "cross purposes." That is, where

$$\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$$

as in Figures 6.8(b), 6.9(b).

The eligible costs for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ for this case are shown
in Figure 6.13 (as in Figure 6.9(b)). By the following arguments
(each indicated on Figure 6.13), we can eliminate many of these candidate
costs from consideration over certain $x_{N-1}$ regions:

1. As noted earlier (in (6.117)), Proposition 5.2
   eliminates $V^{2,L}_{N-1}$ and $V^{2,R}_{N-1}$ from consideration
   (costs corresponding to hedging to the "wrong"
   sides of the $\hat V_N(x_N | r_{N-1}=1)$ discontinuities).

2.  KJJ(2)>  1^(1)  implies  that  (2)  hence 

<  V2'?  (as  functions  of  x„  ,).  So 
N-l  N-l  N-l 

2,U 

V  is  not  optimal  over 
N-l 

x  €(9  (2)/a(l) ,0  (2)/a(l) ) .  Similarly, 


Figure 6.13: Eliminating candidate cost-to-go functions over different
regions of $x_{N-1}$ values when $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$ (Case 2);
(n) indicates that the specified candidate is eliminated
over this interval of $x_{N-1}$ values by step n in the text.

The  intersection  in  (6.146)  that  is  greater  than  9  _  (2)/a(l) 

N— 1 


W21  ,  r 

Vl  -  ad)  [  \  Sn-1<2> 


This is the point at which the optimal candidate cost changes from
$V^{1,R}_{N-1}$ to $V^{2,U}_{N-1}$. Similarly, the optimal candidate cost changes from
$V^{2,U}_{N-1}$ to $V^{3,L}_{N-1}$ at


ViLa ,  J,  Vi131 

*N-1  a(l)  \  9N-1<2> 


Collecting  all  of  this  information  yields  the  following: 


Fact 6.10: When $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$ (Case 2), the optimal
cost-to-go and control laws have

$$m_{N-1}(1) = 5 \tag{6.147}$$

pieces joined at $x_{N-1}$ values

$$\delta_{N-1}(1) = \frac{-\alpha}{a(1)R(1)}\left[R(1) + b^2(1)\hat K_N(1)\right] \tag{6.148}$$

$$\delta_{N-1}(2) = \frac{-\alpha}{a(1)R(1)}\left(R(1) + b^2(1)\hat K_N(2)\right)\left[1 - \sqrt{1 - \frac{R(1) + b^2(1)\hat K_N(1)}{R(1) + b^2(1)\hat K_N(2)}}\,\right] \tag{6.149}$$

$$\delta_{N-1}(3) = -\delta_{N-1}(2) \tag{6.150}$$

$$\delta_{N-1}(4) = -\delta_{N-1}(1) \tag{6.151}$$

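The optimal candidate costs-to-go are

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \begin{cases} V^{1,U}_{N-1} & \text{if } x_{N-1} \le \delta_{N-1}(1) \\ V^{1,R}_{N-1} & \text{if } \delta_{N-1}(1) \le x_{N-1} \le \delta_{N-1}(2) \\ V^{2,U}_{N-1} & \text{if } \delta_{N-1}(2) \le x_{N-1} \le \delta_{N-1}(3) \\ V^{3,L}_{N-1} & \text{if } \delta_{N-1}(3) \le x_{N-1} \le \delta_{N-1}(4) \\ V^{3,U}_{N-1} & \text{if } \delta_{N-1}(4) \le x_{N-1} \end{cases} \tag{6.152}$$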


Thus the optimal expected cost-to-go is given by (6.127) where
$\{K_{N-1}(t:1), H_{N-1}(t:1): t=1,\ldots,m_{N-1}(1)=5\}$ are the same as for the
earlier case (as in (6.128)-(6.132)) except that

$$G_{N-1}(2:1) = \frac{\alpha^2}{b^2(1)}\left(R(1) + b^2(1)\hat K_N(1)\right) = G_{N-1}(4:1) \tag{6.153}$$

$$G_{N-1}(1:1) = G_{N-1}(3:1) = G_{N-1}(5:1) = 0.$$

The optimal control law is given by (6.135) where
$\{L_{N-1}(t:1): t=1,\ldots,m_{N-1}(1)=5\}$ is the same as for the earlier case
(as in (6.136)-(6.138)) except that

$$F_{N-1}(2:1) = \frac{-\alpha}{b(1)} = -F_{N-1}(4:1) \tag{6.153}$$

$$F_{N-1}(1:1) = F_{N-1}(3:1) = F_{N-1}(5:1) = 0.$$

The $x_N$ value obtained by the application of the optimal control
law, as a function of $x_{N-1}$, is given by (6.141). Hence

$$x_N(x_{N-1}, r_{N-1}=1) = \begin{cases} \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(1)}\,x_{N-1} & t = 1 \\ -\alpha & t = 2 \\ \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(2)}\,x_{N-1} & t = 3 \\ +\alpha & t = 4 \\ \dfrac{a(1)R(1)}{R(1) + b^2(1)\hat K_N(3)}\,x_{N-1} & t = 5 \end{cases} \tag{6.154}$$

□


We see that for this case, hedging is to the other side of $-\alpha$ and
$+\alpha$ (since $\hat V_N(x_N | r_{N-1}=1)$ is now less on the opposite side of these
values). In Figure 6.14, $u_{N-1}(x_{N-1}, r_{N-1}=1)$ is shown for b(1) > 0.
The control law increases discontinuously at $x_{N-1} = \delta_{N-1}(2)$, where the
optimal strategy changes from hedging to the point $-\alpha \in \Delta_N(1)$ to
driving $x_N$ into $\Delta_N(2)$. Similarly, $u_{N-1}(x_{N-1}, r_{N-1}=1)$ increases at
$x_{N-1} = \delta_{N-1}(3)$. Elsewhere it is continuous.

The value of $x_N$ obtained by application of the optimal control
law to $x_{N-1}$ is shown in Figure 6.15. Recall that in the earlier case
(where $\hat K_N(2) < \hat K_N(1) = \hat K_N(3)$) we saw that the optimal closed loop
system was more stable than the open loop system in form 1. This need
not be true when $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$. In particular, the optimal
controller in form r=1

• is more stable than the open loop system for
  $x_{N-1}$ values from which we do not hedge to a point.

• may be more stable or less stable than the open
  loop system over $x_{N-1}$ values from which we hedge
  to a point, depending upon the values of the
  quantities $b^2(1)/R(1)$ and $(\hat K_N(2) - \hat K_N(1))$.


To see this we note that (as in the earlier case) for $x_{N-1}$ regions
where the slope in Figure 6.15 is positive, the closed loop dynamics

$$\frac{a(1)}{1 + \dfrac{b^2(1)}{R(1)}\hat K_N(i)} < a(1) \qquad i = 1, 2, 3.$$

But in the hedging regions $x_{N-1} \in (\delta_{N-1}(1), \delta_{N-1}(2))$ and
$x_{N-1} \in (\delta_{N-1}(3), \delta_{N-1}(4))$ the optimal controller is more stable
if $a(1)|x_{N-1}| > \alpha$, less stable if $a(1)|x_{N-1}| < \alpha$. In these
hedging regions

$$|\delta_{N-1}(2)| < |x_{N-1}| < |\delta_{N-1}(1)|$$

hence from (6.148)-(6.151),

That is, the optimal closed loop system will be less stable than
the open loop system in form 1 for some $x_{N-1}$ values where the
optimal strategy is to hedge to a point, if and only if

(1) the excess cost $\hat K_N(2) - \hat K_N(1)$ of being on
    the wrong side of the hedging point is
    large enough

(2) the control effectiveness $b^2(1)/R(1)$ is small
    enough.


Note that there are two regions of avoided $x_N$ values in Figure 6.15
(as in the earlier case). They are


R(l)+b  (1)^(1) 


I 


R(l)+b  <1)K  (2)  // 


R(l)+b  (U+K^U) 
R(l)+b2(l)KN(2) 


a 


(6.158) 


The width of each of these regions of $x_N$ avoidance is


a  <  a 


R(l)  +b  (1) K  (2) 

_ _ N _ 

Rd)+b2(i)yi) 


'<  2  a 


(6.159) 


Comparing (6.159) with (6.145) we see that the width of these
regions of avoidance varies with $\alpha$, $b^2(1)/R(1)$ and the savings from
hedging (here $\hat K_N(2) - \hat K_N(1)$), as in the previous case.

Each region of avoidance is again associated with a joining point
where the slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ discontinuously decreases (i.e.,
$\delta_{N-1}(2)$ and $\delta_{N-1}(3)$ in this case).


There are two different shapes that the expected cost-to-go
$V_{N-1}(x_{N-1}, r_{N-1}=1)$ can take when $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$. These are
shown in Figure 6.16. In both Figures 6.16(a) and 6.16(b), the slope
of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has a negative discontinuity at $\delta_{N-1}(2)$ and
$\delta_{N-1}(3)$, and is continuous elsewhere. For $x_{N-1} \in (\delta_{N-1}(1), \delta_{N-1}(2))$ the
optimal controller actively hedges to $x_N = -\alpha$. The resulting optimal
expected cost-to-go is a quadratic interpolation between $V^{1,U}_{N-1}$
and $V^{2,U}_{N-1}$. Similarly for $x_{N-1} \in (\delta_{N-1}(3), \delta_{N-1}(4))$. In this case
(as opposed to Figure 6.10), the $V^{1,R}_{N-1}$ curve is below the $V^{2,U}_{N-1}$
curve.

The  width  of  the  one-step  hedging  regions  for  this  case  is 


(width  of  one- 
step  hedging 
regions 

(6.160) 


(compare to (6.143)). The comments following (6.143) regarding the
width of these hedging regions apply for this case as well. Thus
when $\alpha \ne 0$, the optimal expected cost-to-go $V_{N-1}(x_{N-1}, r_{N-1}=1)$ for all
problems of the class formulated in Section 6.5 involves active
hedging to a point.


When Figure 6.16(a) applies $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has a single
local minimum at $x_{N-1} = 0$, as in Case 1. But when Figure 6.16(b)
applies $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has two additional local minima as well
as the global minimum at $x_{N-1} = 0$. In this situation $V_{N-1}(x_{N-1}, r_{N-1}=1)$
is not monotone for $x_{N-1} < 0$ and $x_{N-1} > 0$ (Note: in Figure 6.16(a)

y2>y1  but  in  Figure  6.16(b)  we  can  have  or  ' 

The following proposition states necessary and sufficient conditions
for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ to have local minima (for the problem
of Section 6.5):

Proposition 6.11 (Local minima)

Consider the problem of Section 6.5. $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has a
single local minimum at zero if and only if the following condition
holds:

$$\frac{1}{\hat K_N(1)} - \frac{1}{\hat K_N(2)} \le \frac{b^2(1)}{R(1)} \tag{6.161}$$

If (6.161) does not hold then $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has two local
minima as well, at

$$x_{N-1} = \pm\frac{\alpha}{a(1)}$$

each with value

$$V_{N-1}\left(\frac{-\alpha}{a(1)}, r_{N-1}=1\right) = V_{N-1}\left(\frac{\alpha}{a(1)}, r_{N-1}=1\right) = \alpha^2 \hat K_N(1) \tag{6.163}$$

The proof of this is straightforward, and appears in Appendix C.10. □
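The local-minimum condition can be explored numerically on a grid. The sketch below is only an illustration: it assumes the one-stage cost $u^2 R(1) + x_N^2 \hat K_N(t)$ with closed regions (so hedging exactly to $\pm\alpha$ attains the limiting cost), it tests the condition in the form $1/\hat K_N(1) - 1/\hat K_N(2) > b^2(1)/R(1)$, and all numerical values are hypothetical.

```python
import math


def one_stage_cost(x, a, b, r, k_hat, alpha):
    """Optimal one-stage cost from x in form 1, minimizing region by region."""
    regions = [
        (-math.inf, -alpha, k_hat[0]),
        (-alpha, alpha, k_hat[1]),
        (alpha, math.inf, k_hat[2]),
    ]
    best = math.inf
    for lo, hi, k in regions:
        target = a * r * x / (r + b * b * k)       # unconstrained next state
        x_next = min(max(target, lo), hi)          # clip: hedge to the boundary
        u = (x_next - a * x) / b
        best = min(best, r * u * u + k * x_next * x_next)
    return best


def extra_minima_expected(b, r, k_hat):
    """Condition tested here: 1/K_hat(1) - 1/K_hat(2) > b^2/R (an assumed reading)."""
    return 1.0 / k_hat[0] - 1.0 / k_hat[1] > b * b / r


def local_minima(a, b, r, k_hat, alpha, lo=-300, hi=300, scale=100.0):
    """Grid points x = i/scale that are strict local minima of the one-stage cost."""
    xs = [i / scale for i in range(lo, hi + 1)]
    v = [one_stage_cost(x, a, b, r, k_hat, alpha) for x in xs]
    return [xs[i] for i in range(1, len(xs) - 1) if v[i] < v[i - 1] and v[i] < v[i + 1]]


if __name__ == "__main__":
    k_hat = (1.0, 4.0, 1.0)    # hypothetical Case 2 gains
    print(local_minima(1.0, 0.5, 1.0, k_hat, 1.0), extra_minima_expected(0.5, 1.0, k_hat))
    print(local_minima(1.0, 1.0, 1.0, k_hat, 1.0), extra_minima_expected(1.0, 1.0, k_hat))
```

With the weak input gain b(1) = 0.5 the grid search finds minima at $x_{N-1} = -1, 0, 1$ (here $\alpha/a(1) = 1$), each side minimum with value $\alpha^2 \hat K_N(1)$; with b(1) = 1 only the minimum at zero remains.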


This  proposition  can  be  explained  as  follows: 

V  , (x  , ,r  =1)  has  local  minima  if  and  only  if  the 
N— 1  N-l  N— 1 

following  conditions  both  hold: 


(1)  V2)>  VX)  E  \{3) 

(the  costs  are  at  "cross  purposes") 


and 


(2) 


the  control  effectiveness 


b2(l) 

R(l) 


is  small  enough. 


Thus the form transition probability discontinuity locations,
$\pm a$, do not bear upon the existence of these local minima (and at
time $k=N-1$, $a(1)$ does not affect them either).

Note that the condition (6.161) for local minima is the same as
condition (6.157) for the optimal closed loop system to be less
stable than the open loop system (for some $x_{N-1}$ values). In particular,
we can derive a relationship between the existence of local minima and
the values of $a(1)$ and $a$ in terms of the joining points where the
slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ decreases. Clearly the local minima
exist if and only if


min 

V1'?  - 

< 

<S 

.  (2) 

x  , 
N-l 

N-l 

a(l) 

N-l 

min 

ii 

•4  . 

ro 

> 

-2-  > 

5  . 

.  (3) 

Vi 

N-l 

a(l) 

N-l 

(6.164) 


Thus they exist if and only if

    $a(1)\,\delta_{N-1}(2) > -a$
    $a(1)\,\delta_{N-1}(3) < a$                                   (6.165)

which means that the open loop drift $a(1)$ would drive
$x_{N-1} = \delta_{N-1}(2)$ and $x_{N-1} = \delta_{N-1}(3)$
to the more costly sides of $x_N = -a$ and $x_N = a$, respectively. But
(6.165) holds if and only if


b2(l) 

R(l) 


R(l)+b2(l)KN(l) 
R(l)+b2  (1)^(2) 


<  1 


which illustrates the a-independence of the existence of these
local minima.

We can extend these ideas to more general x-dependent
JLQ problems. A necessary condition for the existence of local
minima in $V_k(x_k, r_k=j)$ can be stated in terms of the conditional
expected cost-to-go $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ as follows:


Proposition  6.12:  Consider  the  problem  of  Proposition  5.1,  where 
all  of  the  x  costs  have  only  quadratic  terms  (i.e.,  S(j)  *  P(j)  = 

yj>  -  yi>-o.  j®>.  if  Vi 1  Vi  I  is  monotonely  nonincreasing 

for $x_{k+1} \le 0$ and monotonely nondecreasing for $x_{k+1} \ge 0$, then
$V_k(x_k, r_k=j)$ has a single minimum.


Thus $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ must be nonmonotone if $V_k(x_k, r_k=j)$ has
additional local minima. This proposition follows directly from Proposition 5.3 (part 4).
Note that Proposition 6.12 does not provide sufficient conditions for the
existence of additional local minima. For example, in case 2 problems where


b2(l) 
R(l)  - 


y^y21 


A  | 

we have $\hat V_N(x_N \mid r_{N-1}=1)$ nonmonotone for $x_N > 0$ and $x_N < 0$ (as in Figure
6.8(b)) but, by Proposition 6.11, $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has a single
minimum at zero (as in Figure 6.16(a)).

This  concludes  our  consideration  of  the  last-stage  solution  for 
the  class  of  problems  that  are  formulated  in  Section  6.5.  We  have 


shown  that  : 


V2,-V1)  .  b2d 


^  *  R ( 1) 

V1,V2) 
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7.  This  same  condition  is  necessary  and  sufficient 

for the existence of local minima in $V_{N-1}(x_{N-1}, r_{N-1}=1)$.

6.7 Summary

We have now characterized the time-varying and steady-state
behavior of the endpieces and middlepieces of the optimal JLQ controller
(when (6.1)-(6.4) hold) and we have obtained bounds on the expected
costs-to-go that afford some description between these pieces. For
a special class of problems we have explored all of the possible
behaviors of the last time stage solution and have given some indication
of the issues which arise at the next time stage.

In the next chapter we will examine further the two problem cases
of sections 6.5-6.6. Under certain conditions these problems have
easily computable solutions that will enable us to gain insight into the
general steady-state behavior of JLQ problems with x-dependent forms.

An algorithm for solving the general scalar JLQ problem of Chapter 5
will also be presented and illustrated by numerical examples. In addition
we will consider "finite look-ahead" approximations of the optimal
steady-state controller.


7.  COMPUTATION  AND  TIME-VARYING  BEHAVIOR  OF 
THE  JLQ  CONTROLLER 

7.1 Introduction

In this chapter we conclude our examination of the noiseless,
scalar x-dependent JLQ control problem of chapters 5 and 6 with a
study of two topics in detail. These are

•  the efficient computation of the optimal JLQ
controller of Proposition 5.1, using the qualitative
and combinatoric results established in
chapters 5 and 6.

•  the time-varying behavior of the optimal controller
(as the number of stages from the terminal time
increases).


In section 7.2 we develop a solution algorithm for the general
problem of Proposition 5.1. It is presented in flowchart form and
described in detail. The basic idea is to compute the optimal cost
function $V_k(x_k, r_k=j)$ at time stage k (and in each form j) one piece
at a time, starting on the left (with the left endpiece). Using
Propositions 5.2 and 5.3, the number of computations and comparisons
that this solution algorithm must make is greatly reduced from those
of the "brute force" solution technique in chapter 5.

The solution algorithm developed in section 7.2 is applicable to
all problems satisfying the requirements of Proposition 5.1. This
class of problems is extremely rich. The resulting optimal controllers
can exhibit a wide variety of qualitative behaviors. Analytical
characterizations of these JLQ controllers that are sufficiently general to
encompass the entire problem class tend to be uninformative, since so
many diverse behaviors must be simultaneously accounted for.

We have chosen in sections 7.3-7.6 to focus on problems that
lend insight into the kinds of qualitative JLQ controller behaviors that
are appropriate in fault-tolerant control applications. Our vehicle for
doing this is the single form-transition problem that was developed in
sections 6.5 and 6.6. We are particularly interested in comparing and
contrasting the qualitative behaviors of the optimal JLQ controllers in
two archetypical classes of problems. In one of these classes the twin
goals of high performance and high reliability are commensurate. In the
other class they are at cross purposes.

In sections 7.3 and 7.4 we illustrate the wide range of parametrically
determined, qualitatively different cases that can arise even in the single
form-transition problem of section 6.5. In particular we find conditions
which imply that the middlepiece and/or endpieces of the optimal expected
cost-to-go $V_k(x_k, r_k=1)$ coincide with the upper and lower bounds of
chapter 6 (that is, they are described by the same function of $x_k$).

The facts established in sections 7.3 and 7.4 are used in sections
7.5 and 7.6 to obtain and study in detail classes of problems (mentioned
above) that are representative of fault-tolerant control problem
applications. For these problems the algorithm of section 7.2 reduces to the
solution of (increasingly many) sets of difference equations (as (N-k)
increases). This makes these problems amenable to further detailed
analysis and it lets us illustrate some of the controller properties and
qualitative issues that arise from the use of control to achieve both
reliability and performance goals. We can analyze the infinite time
horizon behavior of JLQ problems in these two classes and obtain the
optimal steady-state controllers as $(N-k) \to \infty$, since the optimal
controller at each time can be obtained from the solution of increasingly many
difference equations without making the comparisons and tests in the
solution algorithm that are needed in general.

The steady-state solutions that are obtained for these two problem
classes exhibit a structure that suggests a "natural" approximation to
the steady-state optimal controller (both for these problems and the
general class of problems in chapter 5) that can be made arbitrarily close
to optimal. These approximations correspond to finite look-ahead
controllers which ignore eventualities that occur beyond some fixed
planning time. By ignoring the far future, optimality is lost in these
controllers but the computational burden of determining them and the
complexity and cost of implementing them are reduced. This approximation
idea is developed in section 7.7. Finally in section 7.8 we summarize
the results of Part III of the thesis.

7.2  An  Algorithm  for  the  Off-line  Determination  of  the  Optimal  Controller 

In this section we develop an algorithm that enables us to solve the
general scalar-x JLQ problem of Chapter 5. This algorithm is based upon
application of the one-stage solution of Proposition 5.1 recursively,
backwards in time, for each form $j \in M$ that the system can take.

The solution of Proposition 5.1 at a specific time k and from a
specific form j involves the computation and comparison of many
quadratic cost functions. These cost functions correspond to single time steps
of constrained-in-x and unconstrained JLQ problems with x-independent
transition probabilities, as described in chapter 5. Fortunately many of
the candidate cost computations and comparisons that are indicated in the
constructive proof of Proposition 5.1 (in section 5.4) can be avoided, due
to the qualitative properties and facts that we have established in
chapters 5 and 6. Our algorithm takes advantage of these results.

The basic ideas of the algorithm can be summarized as follows:

1. For each form $j \in M$ at time k, we can compute
$V_k(x_k, r_k=j)$ one piece at a time, sweeping from left to right
along the axis of $a(j)x_k$ values. We start with the endpiece of
$V_k(x_k, r_k=j)$ that corresponds to large negative values of $a(j)x_k$
(i.e., $V_k^{Le}(x_k, j)$ for $a(j) > 0$ and $V_k^{Re}(x_k, j)$ for $a(j) < 0$), since
we know that this endpiece is optimal for sufficiently negative
$a(j)x_k$ (from Proposition 6.1).

2. As we sweep rightwards along the $a(j)x_k$ axis, we compare the
solutions of each of the constrained-in-$x_{k+1}$, x-independent JLQ
control problems of step 3 in section 5.4. The optimal cost
$V_k(x_k, r_k=j)$ at each $x_k$ value is the minimal value of these
constrained problem solutions, evaluated at $(x_k, r_k=j)$. We will say
that a quadratic cost function is valid over a specific interval
of $a(j)x_k$ values if it solves a constrained problem of step 3 of
section 5.4 over this interval. That is,

    $V_k^{t,L}$ is valid for $x_k \le \theta_k^j(t)/a(j)$

    $V_k^{t,U}$ is valid for $\theta_k^j(t)/a(j) \le x_k \le \Theta_k^j(t)/a(j)$

    $V_k^{t,R}$ is valid for $x_k \ge \Theta_k^j(t)/a(j)$

Thus the list of valid costs changes as we sweep rightwards along
the $a(j)x_k$ axis. In each successive region of $a(j)x_k$ values we
need only look at those quadratic cost functions that are valid.


At each point we have a prevailing optimal cost which is the
optimal cost for points immediately to the left. As we
proceed from left to right along the $a(j)x_k$ axis we must
decide when this prevailing cost ceases to be optimal, and
which valid candidate cost-to-go function becomes the new prevailing
optimal. The old prevailing cost can cease to be optimal if it
is crossed by another valid candidate cost (which thereafter becomes
the prevailing optimal cost as we continue the rightward sweep
of the algorithm), or if it ceases to be valid. In the latter
case, the newly valid cost function (that replaces the former
prevailing optimal cost function as the solution of one of the
constrained JLQ problems of step 3 in section 5.4) becomes the
new prevailing cost. Thus, as we sweep rightwards along the axis
of $a(j)x_k$ values we only need to compare valid candidate cost
functions to the prevailing one.


Furthermore, from Propositions 5.2 and 5.3 we know that not all
of the valid cost functions in a given region of $a(j)x_k$ values
are eligible for optimality. By eligible we mean that the
candidate cost function (of $x_k$) in question has not been ruled
out by Proposition 5.2 (at the beginning) or by Proposition 5.3
(as the algorithm progresses). Recall that Proposition 5.2
disqualifies from optimality (for any $x_k$) all candidate cost
functions except the unconstrained $V_k^{t,U}(x_k, j)$ costs and those
constrained $V_k^{t,L}(x_k, j)$, $V_k^{t,R}(x_k, j)$ costs that correspond to
driving $x_{k+1}$ to the less expensive side of a
$\hat V_{k+1}(x_{k+1} \mid r_k=j)$ discontinuity. Also recall that the
mapping from $x_k$ to the optimal choice of $x_{k+1}$:

    $x_k \mapsto x_{k+1}(x_k, r_k=j)$

is monotone (see Proposition 5.3). As we sweep from left to
right along the axis of $a(j)x_k$ values, this fact can also be used
to remove candidate cost functions from eligibility. We can
exclude from further consideration those candidate cost-to-go
functions that correspond to driving $x_{k+1}$ to the left (if
$a(j) > 0$; to the right if $a(j) < 0$) of where the prevailing
controller does. In this way, Proposition 5.3 is used to reduce
the list of eligible candidate cost functions as we sweep rightwards.

Thus the algorithm proceeds, for each form $j \in M$ and at each time k,
by sweeping rightwards along the $a(j)x_k$ axis, comparing valid, eligible
candidate cost functions to the prevailing one. This process begins
with the appropriate endpiece ($V_k^{Le}(x_k, j)$ if $a(j) > 0$; $V_k^{Re}(x_k, j)$ if
$a(j) < 0$) and ends when the other endpiece becomes the prevailing optimal.
If the problem is completely symmetric about zero (i.e., all costs and
form transition probabilities) then the sweep need only proceed until the
middle piece is reached.
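The notion of validity used in this summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `valid_pieces`, the dictionary layout, and the numerical breakpoints are hypothetical and are not taken from the thesis; the breakpoints are expressed directly on the $a(j)x_k$ axis.

```python
import math

def valid_pieces(ax, theta, Theta):
    """Return the valid piece ('L', 'U' or 'R') of each candidate t at the
    point ax = a(j) * x_k.

    theta[t] and Theta[t] are the lower and upper breakpoints of candidate t
    on the a(j) x_k axis (hypothetical container layout, theta <= Theta).
    """
    pieces = {}
    for t in theta:
        if ax < theta[t]:
            pieces[t] = "L"   # left constrained piece V^{t,L}
        elif ax <= Theta[t]:
            pieces[t] = "U"   # unconstrained piece V^{t,U}
        else:
            pieces[t] = "R"   # right constrained piece V^{t,R}
    return pieces

# Illustrative breakpoints only (not the values of any thesis example).
theta = {1: -math.inf, 2: -3.0, 3: 1.0}
Theta = {1: -4.0, 2: 0.5, 3: math.inf}
print(valid_pieces(-5.0, theta, Theta))
```

As the sweep moves to the right, a candidate's valid piece can only change from L to U to R, which is why the list of valid costs needs updating only when a breakpoint is passed.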

An overview of the solution algorithm is shown in figure 7.1. The
algorithm is initialized with the terminal time (k = N) cost parameter
(block 1). Then for successively decreasing times through $k = k_0$
(block 2), the one-stage solution of Proposition 5.1 is obtained for
each form $j \in M$ (block 3).

The symbol := in figure 7.1 denotes replacement. For example,
j := 2j + i + 1 means that the value of variable j is replaced by
the value of the expression 2j + i + 1.
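The outer loop of figure 7.1 might be sketched as follows. This is a skeleton under stated assumptions, not the thesis's flowchart: `backward_recursion`, `terminal_cost` and `one_stage` are hypothetical names, and `one_stage` is a caller-supplied stand-in for the one-stage solution of Proposition 5.1, which is not implemented here.

```python
def backward_recursion(N, k0, forms, terminal_cost, one_stage):
    """Apply a one-stage solver backwards in time for every form.

    one_stage(k, j, next_costs) is a placeholder for the one-stage
    solution of Proposition 5.1; it must be supplied by the caller.
    """
    costs = {N: {j: terminal_cost(j) for j in forms}}      # block 1
    k = N - 1
    while k >= k0:                                          # block 2
        costs[k] = {j: one_stage(k, j, costs[k + 1])        # block 3
                    for j in forms}
        k = k - 1        # replacement in the sense of ":=", i.e. k := k - 1
    return costs

# Toy stand-ins so that the skeleton runs; they carry no JLQ meaning.
toy = backward_recursion(
    N=3, k0=1, forms=[1, 2],
    terminal_cost=lambda j: 0.0,
    one_stage=lambda k, j, nxt: 1.0 + min(nxt.values()),
)
print(toy[1])
```

The dictionary `costs[k][j]` simply stores whatever object the one-stage solver returns; in the algorithm of this section that would be the list of pieces of the optimal cost and control law.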




Before discussing how the solution algorithm accomplishes this
determination of $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ let us recall the steps
for doing this that were specified in section 5.4. We will then
indicate how these tasks can be simplified. The steps in section 5.4
were:


Step 1: A composite partition of the real line (of $x_{k+1}$ values) is
obtained, consisting of $\psi_{k+1}^j$ nonoverlapping intervals
$\Delta_{k+1}^j(t)$, $t = 1, \ldots, \psi_{k+1}^j$, with endpoints
$\gamma_{k+1}^j(t-1)$ and $\gamma_{k+1}^j(t)$, where
$\gamma_{k+1}^j(0) = -\infty$, $\gamma_{k+1}^j(\psi_{k+1}^j) = \infty$,
and the grid points are $\{\gamma_{k+1}^j(t) : t = 1, \ldots, \psi_{k+1}^j - 1\}$.

This is done by superimposing the grids of the next-time
expected costs-to-go $V_{k+1}(x_{k+1}, r_{k+1}=i)$ joining points

    $\{\delta_{k+1}^i(\ell) : \ell = 1, \ldots, m_{k+1}(i) - 1\}$

and of the transition probability discontinuity locations

    $\{\nu_{ji}(\ell) : \ell = 1, \ldots, \bar\nu_{ji} - 1\}$

for all i in the cover of j ($i \in C_j$).

Step 2: For each of these intervals ($t = 1, \ldots, \psi_{k+1}^j$) a
constrained-in-$x_{k+1}$ JLQ problem is formulated:


Wk^l’w 


e  a-  ,  (t)  =  nun 


xkH  S  Ak+l(t) 


^  Vi(ltkulv3)) 


Step 3: These constrained problems are then each solved. Their solutions
are piecewise-quadratic in $x_k$ with three pieces:

    $V_k(x_k, j \mid x_{k+1} \in \Delta_{k+1}^j(t)) =$

        $V_k^{t,L}(x_k, j)$   for $x_k \le \theta_k^j(t)/a(j)$

        $V_k^{t,U}(x_k, j)$   for $\theta_k^j(t)/a(j) \le x_k \le \Theta_k^j(t)/a(j)$

        $V_k^{t,R}(x_k, j)$   for $x_k \ge \Theta_k^j(t)/a(j)$

except for $t = 1$ and $t = \psi_{k+1}^j$, which have only two pieces
($\theta_k^j(1)/a(j) = -\infty$, $\Theta_k^j(\psi_{k+1}^j)/a(j) = +\infty$).

Step 4: The optimal expected cost-to-go $V_k(x_k, r_k=j)$ at each $x_k$ is
the lowest of the constrained cost solutions (of step 3)
at that $x_k$.
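Step 1 above amounts to merging several grids into one sorted grid. A minimal Python sketch follows; the function name `composite_partition`, the dictionary layout and the sample numbers are assumptions made for illustration and do not come from the thesis.

```python
import math

def composite_partition(joining_points, discontinuities):
    """Superimpose several grids of x_{k+1} values into one partition.

    joining_points: dict, form i -> joining points of V_{k+1}(., r_{k+1}=i)
    discontinuities: dict, form i -> transition-probability discontinuity
        locations for the transition from form j to form i
    Returns the sorted grid points and the intervals they induce.
    """
    grid = set()
    for i in joining_points:
        grid.update(joining_points[i])
    for i in discontinuities:
        grid.update(discontinuities[i])
    grid = sorted(grid)                      # coincident points merge
    edges = [-math.inf] + grid + [math.inf]
    intervals = list(zip(edges[:-1], edges[1:]))
    return grid, intervals

# Illustrative numbers only.
grid, intervals = composite_partition(
    joining_points={1: [-2.0, 0.0, 2.0], 2: []},
    discontinuities={1: [], 2: [-1.0, 1.0]},
)
print(len(intervals), grid)
```

The number of intervals returned plays the role of the piece count of the composite partition, and each interval corresponds to one constrained problem of step 2.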


A "brute force" implementation of step 4 would be to compute and find
the intersections of the $(3\psi_{k+1}^j - 2)$ quadratic functions of $x_k$:

    $V_k^{1,U}(x_k, j)$, $V_k^{1,R}(x_k, j)$, $V_k^{2,L}(x_k, j)$, $V_k^{2,U}(x_k, j)$, $V_k^{2,R}(x_k, j)$,
    ..., $V_k^{\psi_{k+1}^j - 1,L}(x_k, j)$, $V_k^{\psi_{k+1}^j - 1,U}(x_k, j)$, $V_k^{\psi_{k+1}^j - 1,R}(x_k, j)$,
    $V_k^{\psi_{k+1}^j,L}(x_k, j)$, $V_k^{\psi_{k+1}^j,U}(x_k, j)$.

At each $x_k$ value, $V_k(x_k, r_k=j)$ is then chosen to be the candidate having
the least cost, among those that are valid at this $x_k$.

[Flowchart figure: the block 3 iteration for time stage k and form j
starts with the initialization for this iteration and obtains the
composite partition (step 1 of section 5.4) by superimposing the
$V_{k+1}(x_{k+1}, r_{k+1}=i)$ joining points and the form transition
probability discontinuity locations for all forms i in the cover of j,
yielding the number of pieces in the composite $x_{k+1}$ partition and
its grid points.]
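For comparison with the more efficient procedure developed next, the brute-force rule can be sketched as a pointwise minimum over the valid candidates. The tuple layout `(K, H, G, lo, hi)`, the function name and the coefficients below are hypothetical; a real implementation would obtain them from the constrained problem solutions.

```python
import math

def brute_force_cost(x, candidates):
    """Evaluate the lower envelope of valid quadratic candidates at x.

    Each candidate is (K, H, G, lo, hi): the cost K*x**2 + H*x + G is
    treated as valid only for lo <= x <= hi.
    Returns (minimal cost, index of the minimizing candidate).
    """
    best, best_idx = math.inf, None
    for idx, (K, H, G, lo, hi) in enumerate(candidates):
        if lo <= x <= hi:
            value = K * x * x + H * x + G
            if value < best:
                best, best_idx = value, idx
    return best, best_idx

# Illustrative coefficients only.
cands = [
    (1.0, 0.0, 0.0, -math.inf, math.inf),   # valid everywhere
    (0.5, 0.0, 2.0, -math.inf, -1.0),       # valid only on the left
    (0.5, 0.0, 2.0, 1.0, math.inf),         # valid only on the right
]
print(brute_force_cost(3.0, cands))
```

This evaluates every candidate at every point of interest, which is exactly the work that the eligibility and monotonicity arguments of Propositions 5.2 and 5.3 are meant to avoid.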

Now we will develop a sequence of tasks that carries out these four
steps in a more efficient manner. In the following discussion we will
refer to a flowchart of the algorithm that is shown in figures 7.2 - 7.6.
All of the steps indicated in this flowchart together constitute one
iteration of block 3 in figure 7.1. That is, they determine the
one-stage JLQ solution that is specified by Proposition 5.1 for some time
stage k and form j.

A macroscopic overview of the algorithm specified by figures 7.2 -
7.6 is as follows:

1. The algorithm first performs step 1 (above) in block 4 (of
figure 7.2). The composite $x_{k+1}$ partition is obtained from quantities
that were computed at the previous time stage (i.e. at time k+1),
and from known parameters of the problem. If the entire problem is
symmetric about zero then we need only consider this partition for

[Flowchart figure residue: computing the cost parameters for all ...
to Proposition 5.2 ... as specified by (C.1.16)-(C.1.18) ... by
(C.1.22) for eligible candidate ...]

Figure 7.5: Flowchart, Part V: Comparisons within a θ-Θ Interval

Figure 7.6: Flowchart, Part VI: Moving Rightwards


2. The next task is to determine which candidate cost-to-go
functions are eligible for optimality with respect to Proposition 5.2,
and to compute the parameters for these eligible functions. Recall
that Proposition 5.2 excludes from eligibility all cost functions that
correspond to actively hedging to a point that is not a discontinuity
of the conditional expected cost-to-go $\hat V_{k+1}(x_{k+1} \mid r_k=j)$.
In block 5 the parameters of $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ are computed. Using
these quantities the parameters are computed for all candidate expected
cost-to-go functions that are eligible for optimality according to
Proposition 5.2. This is done in block 15, as follows:

•  for each $x_{k+1}$ interval $\Delta_{k+1}^j(t)$, $t = 1, \ldots, \psi_{k+1}^j$,
we compute the parameters for the "unconstrained"
cost function $V_k^{t,U}$ (block 3)

•  then at each grid point $\{\gamma_{k+1}^j(t) : t = 1, \ldots, \psi_{k+1}^j - 1\}$ we
test to see if $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ is discontinuous
(blocks 10,11,13). If $\gamma_{k+1}^j(t)$ is a point of discontinuity
of the conditional expected cost-to-go, then the
parameters of the eligible candidate cost (corresponding
to driving $x_{k+1}$ to the low cost side of $\gamma_{k+1}^j(t)$) are
computed (blocks 12,14).


3. Then we prepare for the rightward sweep along the $a(j)x_k$
axis by obtaining (in block 16, figure 7.4) the partition of the real
line (of $a(j)x_k$ values) that is caused by the points

    $\{\theta_k^j(t), \Theta_k^j(t-1) : t = 2, \ldots, \psi_{k+1}^j\}$

These quantities are computed using the values obtained
in block 7.

4. Finally, the algorithm determines the optimal cost (and
control law) over each interval of $a(j)x_k$ values in this θ-Θ
partition, starting on the left. The algorithm sequentially finds

    $V_k(i:j) = x_k^2 K_k(i:j) + x_k H_k(i:j) + G_k(i:j)$

for $i = 1, 2, \ldots, m_k(j)$ when $a(j) > 0$. When $a(j) < 0$ these pieces are
found in reverse order. The same flowchart applies for both $a(j) > 0$
and $a(j) < 0$; if $a(j) < 0$ then we start with the right endpiece instead of
the left endpiece as the initial prevailing cost function (block 19),
and we reverse all indices at the end of the sweep (block 34).

The fourth task above constitutes the main body of the solution
algorithm. We will describe it in detail below. Before doing so,
however, let us consider the θ-Θ partition that was obtained
for example 5.1 at time k = N-2. We will use this example throughout
this section to demonstrate the algorithm's steps.
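The θ-Θ partition of task 3 can be sketched as a sort of the finite breakpoints. In the sketch below, `theta_partition` and the dictionary layout are hypothetical; the two values -31.22 and -30.98 are the $\theta_{N-2}(2)$ and $\Theta_{N-2}(1)$ values quoted for the example in this section, while the remaining numbers are made up purely for illustration.

```python
import math

def theta_partition(theta, Theta):
    """Intervals of a(j) x_k values cut out by the finite breakpoints.

    theta, Theta: dicts t -> breakpoint; infinite entries are ignored
    and coincident breakpoints are merged.
    """
    points = sorted({v for v in list(theta.values()) + list(Theta.values())
                     if math.isfinite(v)})
    edges = [-math.inf] + points + [math.inf]
    return list(zip(edges[:-1], edges[1:]))

# theta[2] and Theta[1] follow the example; the other entries are invented.
theta = {1: -math.inf, 2: -31.22, 3: -4.0}
Theta = {1: -30.98, 2: -12.0, 3: math.inf}
for interval in theta_partition(theta, Theta):
    print(interval)
```

The sweep of task 4 then visits these intervals in order, from the leftmost to the rightmost.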

Example 7.1 (example 5.1, 6.1 revisited)

The candidate cost-to-go functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ that are
valid and eligible at the start of the left-to-right sweep are shown
in figure 7.7. The seven rows correspond to the constrained-in-$x_{N-1}$
optimal costs

    $V_{N-2}(x_{N-2}, r_{N-2}=1 \mid x_{N-1} \in \Delta_{N-1}^1(t))$

for $t = 1, 2, \ldots, \psi_{N-1}^1 = 7$. The regions of $a(1)x_{N-2}$ values where the
various pieces of these costs-to-go are
valid are labelled between the arrows in this figure, and candidate
cost functions that are ineligible (due to Proposition 5.2) at the
start of the (j=1, k=N-2) algorithm iteration are x'd out. Thus at
the start of the left-to-right sweep the eligible, valid candidate
cost-to-go functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ are as listed in table 7.1.

[Figure 7.7: Valid, Eligible Regions of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ Candidate
Cost Functions in Example 5.1, 6.1. Horizontal axis: $a(1)x_{N-2}$, with
marks at -31.22, -30.98, -12.54, -4.56, -3.277, 0, 3.277, 4.56, 12.54,
30.98, 31.22, labelled by the $\theta_{N-2}(2), \ldots, \theta_{N-2}(7)$ and
$\Theta_{N-2}(1), \ldots, \Theta_{N-2}(6)$ values.]


We now resume our description of the solution algorithm with a
detailed description of the left-to-right sweep. The list of initially
eligible candidate cost functions is determined in block 15, and is
updated in block 27 as each new piece of $V_k(x_k, r_k=j)$ is determined.

In the leftmost interval of $a(j)x_k$ values (that is, at the start
of the left-to-right sweep), the valid and eligible candidate
cost-to-go functions are

•  for $a(j) > 0$:
$V_k^{1,U}(x_k, j)$ and those $V_k^{t,L}(x_k, j)$, $t = 2, \ldots, \psi_{k+1}^j$,
that are eligible according to Proposition 5.2 (block 15)

•  for $a(j) < 0$:
$V_k^{\psi_{k+1}^j,U}(x_k, j)$ and those $V_k^{t,R}(x_k, j)$, $t = 1, \ldots, \psi_{k+1}^j - 1$,
that are eligible according to Proposition 5.2.

In this example, a(1) = 1.

Candidate       Region of $a(1)x_{N-2}$                      Eligible from
cost-to-go      validity                                     Proposition 5.2 ?

...
$V^{3,L}$       $(-\infty, \theta_{N-2}(3))$                 no
$V^{3,U}$
$V^{3,R}$                                                    no
$V^{4,L}$       $(-\infty, \theta_{N-2}(4))$
$V^{4,U}$       $(\theta_{N-2}(4), \Theta_{N-2}(4))$
$V^{4,R}$       $(\Theta_{N-2}(4), \infty)$                  yes
$V^{5,L}$       $(-\infty, \theta_{N-2}(5))$                 no
$V^{5,U}$       $(\theta_{N-2}(5), \Theta_{N-2}(5))$         yes
$V^{5,R}$       $(\Theta_{N-2}(5), \infty)$                  no
$V^{6,L}$       $(-\infty, \theta_{N-2}(6))$                 no
$V^{6,U}$       $(\theta_{N-2}(6), \Theta_{N-2}(6))$         yes
$V^{6,R}$       $(\Theta_{N-2}(6), \infty)$
$V^{7,L}$
...

Table 7.1: Candidate Cost-to-go Functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$;
Regions of Validity and Eligibility due to Proposition 5.2

From Proposition 6.1 we know that for sufficiently negative values
of $a(j)x_k$, the optimal candidate cost is

•  the left end piece $V_k^{Le}(x_k, j) = V_k^{1,U}(x_k, j)$ if $a(j) > 0$

•  the right end piece $V_k^{Re}(x_k, j) = V_k^{\psi_{k+1}^j,U}(x_k, j)$ if
$a(j) < 0$.


This provides starting values for the algorithm. In blocks 17, 18
and 19 the first piece cost parameters $K_k(1:j)$, $H_k(1:j)$ and $G_k(1:j)$
are assigned the appropriate endpiece values, and the appropriate
current list of valid eligible candidates is designated. In block 20
the piece counter (m) is set to one and the control law parameters
for the first piece are assigned. The rightward search (along the
axis of $a(j)x_k$ values) for the pieces of the optimal cost function
$V_k(x_k, r_k=j)$ begins in this leftmost interval of the θ-Θ partition,
as indicated in figure 7.5.


Either the prevailing optimal cost (the endpiece function) is
optimal over this entire first interval of $a(j)x_k$ values or it is
crossed by one of the other valid eligible candidates. The intersections
of the prevailing optimal cost (at the start of the θ-Θ partition
interval) with all other valid eligible candidate cost functions (in
this θ-Θ interval) are computed in block 21. We then test (in block
22) to see if any of these intersections are inside the interval of
$a(j)x_k$ values that is under consideration. If the answer to this
question (block 22) is "no", then we know that the prevailing optimal
cost is optimal over the remainder of this θ-Θ partition interval, and
we proceed to the next interval (to the right).
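The intersection test of blocks 21 and 22 can be sketched as a root-finding step on the difference of two quadratics. The sketch below is a simplification under stated assumptions: quadratics are hypothetical `(K, H, G)` triples, the interval is taken as open, and no check is made that the candidate actually becomes cheaper to the right of the crossing (a tangency would also be reported).

```python
import math

def crossings(prevailing, candidate, lo, hi, tol=1e-12):
    """Points in the open interval (lo, hi) where two quadratics are equal.

    Each quadratic is a (K, H, G) triple for K*x**2 + H*x + G.
    """
    dK = candidate[0] - prevailing[0]
    dH = candidate[1] - prevailing[1]
    dG = candidate[2] - prevailing[2]
    if abs(dK) < tol:                       # difference is (at most) linear
        roots = [] if abs(dH) < tol else [-dG / dH]
    else:
        disc = dH * dH - 4.0 * dK * dG
        if disc < 0:
            roots = []
        else:
            s = math.sqrt(disc)
            roots = [(-dH - s) / (2 * dK), (-dH + s) / (2 * dK)]
    return sorted(r for r in set(roots) if lo < r < hi)

def next_joining_point(prevailing, candidates, lo, hi):
    """Leftmost crossing in (lo, hi) and the candidates that cross there."""
    hits = [(x, idx) for idx, c in enumerate(candidates)
            for x in crossings(prevailing, c, lo, hi)]
    if not hits:
        return None, []      # the "no" branch: the prevailing cost persists
    x_min = min(x for x, _ in hits)
    return x_min, [idx for x, idx in hits if x == x_min]

# Illustrative quadratics only: x**2 against the constant 4.
print(next_joining_point((1.0, 0.0, 0.0), [(0.0, 0.0, 4.0)], -5.0, 0.0))
```

When more than one index is returned, the tie would be broken by the direction in which each candidate drives $x_{k+1}$, which this sketch does not model.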

If instead the answer in block 22 is "yes" then the leftmost of
these intersections determines the next joining point $\delta_k^j(m)$ of
$V_k(x_k, r_k=j)$, as indicated in block 23.

The assignments of values for the next optimal piece of
$V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ in this "yes" case are made in
blocks 24, 25 and 26. If only one of the candidate cost functions
crosses the prevailing optimal cost function at $\delta_k^j(m)$ then this cost
becomes the new prevailing optimal. If two or more of the
candidate costs intersect the prevailing optimal cost¹ at $\delta_k^j(m)$ then
we take as the next piece of $V_k(x_k, r_k=j)$ the intersecting candidate
cost which corresponds to driving $x_{k+1}$

•  the furthest to the right if $a(j) > 0$

•  the furthest to the left if $a(j) < 0$.

This choice is made because we know from the monotonicity of the
optimal $x_k \mapsto x_{k+1}(x_k, r_k=j)$ mapping (in Proposition 5.3) that the
other costs intersecting at $\delta_k^j(m)$ can be optimal only at this single
intersecting point.

¹ An unlikely but possible occurrence.

We can use this monotonicity (in block 27) to remove from further
consideration during the remainder of this rightward sweep¹ all candidate
costs that drive $x_{k+1}$

•  to the left (for $a(j) > 0$)

or

•  to the right (for $a(j) < 0$)

of where the new prevailing optimal cost does. In particular, the
candidate cost that ceased to be optimal at $\delta_k^j(m)$ cannot be optimal
again (as we move rightward along the $a(j)x_k$ line).

This process is continued until $V_k(x_k, r_k=j)$ has been determined
over the entire interval in the θ-Θ partition (that is, until
the answer to block 22's question is "no").

Then the next interval in the θ-Θ partition of $a(j)x_k$ values
(to the right) is considered. Because we have moved past one² of the
$\{\theta_k^j(t), \Theta_k^j(t-1)\}$ values in entering this next interval, the set of
valid candidate cost functions changes:

•  if we have moved past a $\theta_k^j(t)$ value then $V_k^{t,L}(x_k, j)$ ceases
to be a valid cost; it is replaced by $V_k^{t,U}(x_k, j)$ in the
list of valid costs

•  if we have moved past a $\Theta_k^j(t)$ value then $V_k^{t,U}(x_k, j)$
ceases to be a valid cost; it is replaced by
$V_k^{t,R}(x_k, j)$ in the list of valid costs.

This updating is done in block 30. These replacements may or
may not be eligible candidates for optimality with respect to the
criterion of Proposition 5.2 (block 15) and the monotonicity property of
the $x_k \mapsto x_{k+1}(x_k, r_k=j)$ mapping of Proposition 5.3.

¹ For a specific value of j at time k.

² Or more than one if they have the same value.
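The monotonicity-based pruning of block 27 might be sketched as a simple filter. This is a deliberately crude illustration: each candidate is summarized by a single hypothetical "target" number standing for where it drives $x_{k+1}$, whereas in the algorithm that location depends on the candidate's interval and on $x_k$; the names below are not from the thesis.

```python
def prune_by_monotonicity(candidates, prevailing_target, a_j):
    """Drop candidates that drive x_{k+1} to the wrong side of the
    prevailing controller's target.

    candidates: dict label -> target value of x_{k+1} (hypothetical
        one-number summary of each candidate)
    prevailing_target: target of the new prevailing optimal cost
    a_j: the coefficient a(j); only its sign is used (a_j == 0 is
        treated like the negative case in this sketch)
    """
    if a_j > 0:
        # candidates driving x_{k+1} to the left are removed
        return {c: t for c, t in candidates.items() if t >= prevailing_target}
    # candidates driving x_{k+1} to the right are removed
    return {c: t for c, t in candidates.items() if t <= prevailing_target}

# Illustrative labels and targets only.
cands = {"A": -1.0, "B": 0.0, "C": 2.0}
print(prune_by_monotonicity(cands, 0.0, a_j=1.0))
```

Because a removed candidate can never become eligible again during the sweep, the list of comparisons shrinks monotonically as the sweep moves to the right.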

Once the new list of valid costs that are eligible for optimality
has been determined, the algorithm must check to see if the prevailing
optimal cost is still valid (block 31). If it is, then the procedure
described above is carried out for this new interval in the θ-Θ
partition (starting in block 21).

If the prevailing optimal cost ceases to be valid in the new
θ-Θ interval (that is, the answer in block 31 is "no") then the
next joining point, $\delta_k^j(m)$, of $V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$
corresponds to this θ-Θ interval boundary (as in block 32). In
this situation the replacement in block 29 of the former (now
invalid) prevailing cost becomes the new prevailing cost if it is
eligible. If this replacement cost is not eligible, then at least
one of the other newly valid costs will be eligible.¹ The newly
valid eligible cost that corresponds to driving $x_{k+1}$ farthest to the
right (for $a(j) > 0$; to the left for $a(j) < 0$) is the new prevailing
optimal cost (see block 33).

¹ Either the replacement of the now not valid former prevailing cost is
eligible for optimality (with respect to Propositions 5.2 and 5.3) or another
eligible valid cost must intersect the former prevailing optimal cost
at the left boundary of the new $a(j)x_k$ interval, since $V_k(x_k, r_k=j)$
must be continuous in $x_k$ (by Proposition 5.1).

The algorithm proceeds through each interval in the $\theta$-$\Theta$ partition until the last partition interval has been completed¹ (block 28). Then if a(j) < 0, the indexing of the solution parameters is reversed (in block 34). This completes the one-stage solution (as in Prop. 5.1, and block 3) for time stage k in form j.
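The intersection test at the heart of the sweep (blocks 21 and 22) can be sketched in a few lines. This is only an illustration, assuming each candidate cost is stored as a coefficient triple (K, H, G) of the quadratic K x^2 + H x + G; the function names are hypothetical and do not come from the flowchart.

```python
import math

def quadratic_intersections(p, q, tol=1e-12):
    """Real x where K_p x^2 + H_p x + G_p equals K_q x^2 + H_q x + G_q.

    p and q are (K, H, G) triples; roots are returned in ascending order.
    """
    dK, dH, dG = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    if abs(dK) < tol:                 # difference is linear (or constant)
        if abs(dH) < tol:
            return []                 # parallel or identical: no isolated crossing
        return [-dG / dH]
    disc = dH * dH - 4.0 * dK * dG
    if disc < 0.0:
        return []                     # the two costs never meet
    root = math.sqrt(disc)
    return sorted([(-dH - root) / (2.0 * dK), (-dH + root) / (2.0 * dK)])

def first_crossing_in(p, q, lo, hi):
    """Leftmost intersection strictly inside the search interval (lo, hi), else None."""
    for x in quadratic_intersections(p, q):
        if lo < x < hi:
            return x
    return None
```

In a left-to-right sweep, a non-None result would become the next joining point and the crossing candidate would become the prevailing cost; a None result corresponds to moving on to the next interval.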

Example 7.1 (= 5.1, 6.1) continued

Let us now illustrate the algorithm for the k = N-2, j = 1 iteration of example 5.1. Since a(1) = 1 > 0 (in block 17), the left-to-right sweep of the algorithm begins with the left-endpiece cost function $V_{N-2}^{Le}(1) = V^{1,U}$ initially prevailing, and the first $\theta$-$\Theta$ interval to be considered is $(-\infty, \theta_{N-2}(2))$. The values of the first piece of $V_{N-2}(x_{N-2}, 1)$ and $u_{N-2}(x_{N-2}, 1)$ are assigned as specified by blocks 18 and 20. We list below these assignments and all successive ones as the left-to-right sweep progresses.


1. Searching interval $(-\infty,\ \theta_{N-2}(2) = -31.22)$

   Eligible valid candidates: $V^{1,U}$, $V^{4,L}$

   • $V^{1,U}$ initially prevailing since a(1) > 0 (block 17)
   • $K_{N-2}(1:1) = K_{N-2}^{1,U}$, $H_{N-2}(1:1) = G_{N-2}(1:1) = 0$ (block 18)
   • m = 1 (block 19)
   • $L_{N-2}(1:1) = b(1) K_{N-2}(1:1) / [a(1) R(1)]$, $F_{N-2}(1:1) = 0$ (block 19)
   • Prevailing cost function $V^{1,U}$ does not intersect the (only) eligible valid candidate $V^{4,L}$ before $\Theta_{N-2}(1)$ (blocks 21, 22)
   • Move rightwards to $(\theta_{N-2}(2), \Theta_{N-2}(1))$ (block 29)

2. Searching interval $(-31.22 = \theta_{N-2}(2),\ \Theta_{N-2}(1) = -30.98)$

   Eligible valid candidates: $V^{1,U}$, $V^{2,U}$, $V^{4,L}$
   ($V^{2,U}$ replaces $V^{2,L}$ as valid cost since we have passed $\theta_{N-2}(2)$, and it is eligible) (block 30)

   • Prevailing cost $V^{1,U}$ is still valid (block 31)
   • Prevailing cost function $V^{1,U}$ intersects $V^{2,U}$ at $x_{N-2} = -31.179,\ -7.847$; intersections of $V^{1,U}$ and $V^{4,L}$ were computed above (block 21)
   • Since -31.179 is inside the search interval (block 22) we have $\delta_{N-2}(1) = -31.179$ (block 23), and $V^{2,U}$ is the new prevailing cost (block 24)
   • Thus m = 2 (block 25) and $K_{N-2}(2:1) = K_{N-2}^{2,U}$, with $H_{N-2}(2:1)$, $G_{N-2}(2:1)$, $L_{N-2}(2:1)$ and $F_{N-2}(2:1)$ assigned likewise (block 26)
   • Remove $V^{1,U}$ and $V^{\,\cdot\,}$ (second superscript illegible) from future eligibility (due to Proposition 5.3) (block 27)

3. Searching interval $(-31.179 = \delta_{N-2}(1),\ \Theta_{N-2}(1) = -30.98)$

   Eligible valid candidates remaining: $V^{2,U}$, $V^{4,L}$

   • Prevailing cost $V^{2,U}$ does not intersect the (only) eligible valid candidate $V^{4,L}$ inside the search interval (blocks 21, 22)
   • Move rightwards to $(\Theta_{N-2}(1),\ \Theta_{N-2}(2) = \theta_{N-2}(3)) = (-30.98, -12.54)$ (block 29)

4. Searching interval $(\Theta_{N-2}(1),\ \Theta_{N-2}(2) = \theta_{N-2}(3)) = (-30.98, -12.54)$

   Eligible valid candidates: $V^{2,U}$, $V^{4,L}$
   ($V^{1,R}$ replaces $V^{1,U}$ in list of valid candidate costs since we have passed $\Theta_{N-2}(1)$, but $V^{1,R}$ is not eligible) (block 30)

   • Prevailing cost $V^{2,U}$ is still valid (block 31)
   • Intersections of $V^{2,U}$ and $V^{4,L}$ are known from above and they are not in this interval (blocks 21, 22)
   • Move rightwards to $(\theta_{N-2}(3) = \Theta_{N-2}(2),\ \Theta_{N-2}(3)) = (-12.54, -4.56)$ (block 29)

5. Searching interval $(\theta_{N-2}(3) = \Theta_{N-2}(2),\ \Theta_{N-2}(3)) = (-12.54, -4.56)$

   Eligible candidates: $V^{3,U}$, $V^{4,L}$
   ($V^{2,R}$ replaces $V^{2,U}$ and $V^{3,U}$ replaces $V^{3,L}$ since we have passed $\theta_{N-2}(3) = \Theta_{N-2}(2)$; $V^{3,U}$ is eligible but $V^{2,R}$ is not, by Proposition 5.2) (block 30)

   • The prevailing cost $V^{2,U}$ is no longer valid (block 31)
   • $\delta_{N-2}(2) = \theta_{N-2}(3)$ $(= \Theta_{N-2}(2) = -12.54)$ (block 32)
   • Since the old prevailing cost is replaced by $V^{2,R}$, which is not eligible, $V^{3,U}$ is the new prevailing cost (block 33)
   • Thus m = 3 (block 25) and $K_{N-2}(3:1) = K_{N-2}^{3,U}$, $H_{N-2}(3:1) = 0$, $G_{N-2}(3:1) = 0$, $L_{N-2}(3:1) = L_{N-2}^{3,U}$, $F_{N-2}(3:1) = 0$ (block 26)
   • $V^{2,U}$ is removed from future eligibility (block 27)
   • The intersections of the new prevailing cost $V^{3,U}$ and the (only other) eligible valid cost $V^{4,L}$ are at $x_{N-2} = -6.977,\ -2.1417$ (block 21)
   • Since -6.977 is inside the search interval (block 22) we have $\delta_{N-2}(3) = -6.977$ (block 23), and $V^{4,L}$ is the new prevailing cost (block 24)
   • Thus m = 4 (block 25) and $K_{N-2}(4:1) = K_{N-2}^{4,L}$, $H_{N-2}(4:1) = H_{N-2}^{4,L}$, $G_{N-2}(4:1) = G_{N-2}^{4,L}$, $L_{N-2}(4:1) = L_{N-2}^{4,L}$, $F_{N-2}(4:1) = F_{N-2}^{4,L}$ (block 26)
   • Remove $V^{3,U}$ from future eligibility (block 27)
   • The only valid eligible candidate cost in the remainder of this search interval is the prevailing cost $V^{4,L}$ (blocks 21, 22)
   • Move rightwards to search the interval $(\Theta_{N-2}(3), \theta_{N-2}(4))$ (block 29)

6. Searching interval $(\Theta_{N-2}(3), \theta_{N-2}(4)) = (-4.56, -3.277)$

   Eligible valid candidates: $V^{4,L}$ only
   ($V^{3,R}$ replaces $V^{3,U}$ since we have passed $\Theta_{N-2}(3)$, but $V^{3,R}$ isn't eligible) (block 30)

   • Since only $V^{4,L}$ is valid and eligible, move rightwards to the interval $(\theta_{N-2}(4), 0)$ (blocks 30, 31, 21, 22, 29)

7. Searching interval $(\theta_{N-2}(4), 0) = (-3.277, 0)$

   Eligible valid candidates: $V^{4,U}$
   ($V^{4,U}$ replaces $V^{4,L}$ as a valid candidate since we have passed $\theta_{N-2}(4)$, and $V^{4,U}$ is eligible) (block 30)

   • The prevailing cost $V^{4,L}$ is no longer valid (block 31)
   • $\delta_{N-2}(4) = \theta_{N-2}(4) = -3.277$ (block 32)
   • Since $V^{4,U}$ is eligible it is the new prevailing cost (block 33)
   • Thus m = 5 (block 25) and $K_{N-2}(5:1) = K_{N-2}^{4,U}$, $H_{N-2}(5:1) = G_{N-2}(5:1) = 0$, $L_{N-2}(5:1) = L_{N-2}^{4,U}$, $F_{N-2}(5:1) = 0$ (block 26)
   • Remove $V^{4,L}$ from future eligibility (block 27)
   • $V^{4,U}$ is the only eligible valid cost (blocks 21, 22)
   • We are in the rightmost partition since this example is a symmetric problem (see block 4) (block 28)

8. This completes the left-to-right search for j = 1, k = N-2. The optimal cost-to-go $V_{N-2}(x_{N-2}, r_{N-2} = 1)$ and control law $u_{N-2}(x_{N-2}, r_{N-2} = 1)$ parameters have been determined for $x_{N-2} < 0$ and can be obtained for $x_{N-2} > 0$ directly by symmetry. □

¹ If the problem is completely symmetric about zero, then the sweep can be halted when the middle piece cost function becomes optimal.
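The bookkeeping of valid costs in the example above can be mimicked with a small lookup. The sketch below is an illustration only: it assumes that piece t is in its "L" version to the left of $\theta_{N-2}(t)$, its "U" version between $\theta_{N-2}(t)$ and $\Theta_{N-2}(t)$, and its "R" version to the right of $\Theta_{N-2}(t)$, and it uses the threshold values quoted in the example for x < 0. The entries for $\theta_{N-2}(1)$ and $\Theta_{N-2}(4)$ are placeholders, since the example does not state them. Validity is not the same as eligibility, which additionally requires the tests of Propositions 5.2 and 5.3.

```python
import math

def piece_status(x, theta_t, Theta_t):
    """Return which version ("L", "U" or "R") of a piece is valid at x."""
    if x < theta_t:
        return "L"
    if x < Theta_t:
        return "U"
    return "R"

# Thresholds read from Example 7.1 (negative half-line only).
THETA_LOWER = {1: -math.inf, 2: -31.22, 3: -12.54, 4: -3.277}   # theta(1): placeholder
THETA_UPPER = {1: -30.98, 2: -12.54, 3: -4.56, 4: math.inf}     # Theta(4): placeholder

def valid_costs(x):
    """Map each piece index t to the version of V^{t,.} that is valid at x."""
    return {t: piece_status(x, THETA_LOWER[t], THETA_UPPER[t]) for t in THETA_LOWER}
```

For instance, `valid_costs(-20.0)` reports piece 1 as "R" and piece 2 as "U", matching the interval (-30.98, -12.54) of the example.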

The algorithm presented in this section computes, off-line, the optimal control laws and expected cost-to-go parameters in each form $j \in M$ and at each time, for the general class of finite time horizon JLQ problems formulated in chapter 5. This algorithm can also be used to obtain approximations of the optimal steady-state solutions of infinite time horizon problems (provided such steady-state solutions exist) as we will see in Section 7.7.

7.3 Qualitative Behavior of the Optimal Controller in the Single Form-Transition Problem: $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ Shapes

Using  the  algorithm  that  was  described  in  the  previous  section  we 
can  compute  the  optimal  controller  for  any  JLQ  problem  (of  chapter  5) 
with  form  transition  probabilities  that  are  piecewise  constant  in  x. 

The  remainder  of  this  chapter  contains  a  further  examination  of  the 
qualitative  properties  of  these  controllers. 

The  class  of  control  problems  that  is  solvable  using  the  algorithm 
of  section  7.2  is  extremely  rich.  A  wide  range  of  optimal  controllers 
exhibiting  myriad  possible  qualitative  behaviors  can  be  obtained, 
depending  upon  the  choice  of  problem  parameters.  Some  of  these 
controllers  are  relevant  to  fault-tolerant  control  applications  and 
some  are  not.  Consequently,  it  is  impossible  to  make  further  meaningful 


qualitative  statements  about  the  piecewise  constant-in-x  JLQ  control 
problem  in  general  -  there  are  just  too  many  parametric  cases  to 
account  for. 

We  have  chosen  therefore  to  focus  our  attention  in  the  remainder  of 
this  chapter  on  subclasses  of  JLQ  problems  that  enable  us  to  gain 
insight  into  the  kinds  of  parametrically  determined  qualitative  behaviors 
that  are  appropriate  to  fault-tolerant  controllers.  In  particular  we 
will  study  in  further  detail  the  single  form-transition  problem  of 
section 6.5 (i.e., (6.88) - (6.93)). This problem is a useful tool for
study because the iterative algorithm procedures of section 7.2 can be
described by recursive difference equations that are amenable to
detailed analysis¹, but this problem is still sufficiently general to
expose  the  tradeoffs  between  controller  performance  and  reliability 
goals  that  are  the  essence  of  fault-tolerant  control. 


In this section we will examine the shape of the conditional expected cost-to-go $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ for the entire class of single form-transition problems. The purpose of this is to demonstrate the tremendous diversity of $\hat V_{N-1}$ shapes (and hence the broad range of optimal controllers) that can arise at k = N-2 in this problem, leading to the wide variety of controllers as (N-k) increases. In later sections we will examine certain subclasses of the single form-transition problem that possess special structures that facilitate analysis in greater detail.


Recall  that  the  single  form-transition  problem  is  as  follows 


In sections 6.5 and 6.6 we solved for the last stage solution $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ for all parametric cases of (7.2) - (7.5). We did this by first examining the conditional expected cost-to-go $\hat V_N(x_N \mid r_{N-1} = 1)$. Recall that there are two parametric cases of interest:

Case 1: $\hat K_N(2) < \hat K_N(1) = \hat K_N(3)$    (7.6)

Case 2: $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$    (7.7)

Case 1 problems occur when

$(\omega_2 - \omega_1)\,[K_T(2) + Q(2) - (K_T(1) + Q(1))] > 0$.    (7.8)

This situation arises when the system performance goals and reliability goals are commensurate. For example, suppose that

• $\omega_2 > \omega_1$: the probability of the form change is greater away from zero than near it

• $K_T(2) + Q(2) > K_T(1) + Q(1)$: the cost charged at time N is greater in form 2 than in form 1

This would correspond to a system where entry into form 2 represents the occurrence of a costly failure, with the probability of failure p(1,2:x) increasing away from the regulator goal of x = 0. The performance goal (of keeping a weighted sum of $x_N^2 + u_{N-1}^2$ small) is met by making $x_N^2$ small without using too much control (i.e., without making $u_{N-1}^2$ too large). This performance goal is consistent with keeping $x_N$ small so as to reduce the probability of failure.


Case 1 problems also arise when

• $\omega_1 > \omega_2$

• $K_T(2) + Q(2) < K_T(1) + Q(1)$

Here the transition to form 2 results in a lower cost charged at time N and the probability of making this desirable transition is greater near zero than away from it. So again, the performance goal (keeping $x_N^2$ small with small $u_{N-1}^2$) and the reliability goal (increasing the probability of the favorable transition) are commensurate.

Case 1 problems possess a conditional expected cost-to-go $\hat V_N(x_N \mid r_{N-1} = 1)$ at time k = N like that of figure 7.8(a). The conditional cost is discontinuous at $x_N = \pm\alpha$ and the "good" (low cost) sides of these discontinuities are the sides nearer zero.

Case 2 problems occur when

$(\omega_1 - \omega_2)\,[K_T(2) + Q(2) - (K_T(1) + Q(1))] > 0$    (7.9)

This situation arises when the system performance goals and reliability goals are conflicting. That is, the best strategy to reduce the instantaneous cost (a weighted sum of $x_N^2 + u_{N-1}^2$) is at cross-purposes with reducing the probability of being in the more costly form at time N. Case 2 problems occur when either

$\omega_1 > \omega_2$ and $K_T(2) + Q(2) > K_T(1) + Q(1)$

or

$\omega_2 > \omega_1$ and $K_T(1) + Q(1) > K_T(2) + Q(2)$.

They have a conditional expected cost-to-go $\hat V_N(x_N \mid r_{N-1} = 1)$ like that shown in figure 7.8(b). This conditional cost is discontinuous at $x_N = \pm\alpha$ but the "good" sides of the discontinuities are away from zero.
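The case distinction in (7.8) and (7.9) reduces to the sign of a single product, which the following sketch evaluates. The function name and argument order are hypothetical; $\omega_1$ is taken to be the transition probability near zero and $\omega_2$ the one away from zero, as in the discussion above.

```python
def classify_case(omega1, omega2, KT1, Q1, KT2, Q2):
    """Classify a single form-transition problem using the sign test of (7.8)/(7.9).

    Returns "case 1" when performance and reliability goals are commensurate,
    "case 2" when they conflict, and "degenerate" when the product vanishes.
    """
    # Extra cost charged at time N in form 2 relative to form 1.
    delta_cost = (KT2 + Q2) - (KT1 + Q1)
    s = (omega2 - omega1) * delta_cost
    if s > 0:
        return "case 1"
    if s < 0:
        return "case 2"
    return "degenerate"
```

For example, a costly failure that is more likely away from zero (`classify_case(0.1, 0.3, 0.0, 1.0, 2.0, 1.0)`) falls under case 1, whereas the same failure being more likely near zero falls under case 2.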


As shown in section 6.6, there are three possible shapes for the optimal expected cost-to-go $V_{N-1}(x_{N-1}, r_{N-1} = 1)$. These are repeated here in figure 7.9. For Case 1 problems, $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ has a single minimum at zero (shown in figure 7.9(a)). For Case 2 problems $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ can have two additional local minima at $x_{N-1} = \pm\alpha / a(1)$ if and only if

$$\frac{b^2(1)}{R(1)} < \frac{\hat K_N(2) - \hat K_N(1)}{\hat K_N(1)\,\hat K_N(2)} \qquad (7.10)$$

Each of these three $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ shapes can lead to several different shapes of the next stage conditional expected cost-to-go, $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$, depending upon the ordering of the grid points in the composite partition of $x_{N-1}$ (that is, depending on the relative values of $\pm\alpha$, $\delta_{N-1}(1)$, $\delta_{N-1}(2)$, $\delta_{N-1}(3)$ and $\delta_{N-1}(4)$).

Each of the different shapes of the conditional expected cost-to-go $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ will in turn result in one (or more) qualitatively different shapes for the optimal expected cost-to-go at time k = N-2, $V_{N-2}(x_{N-2}, r_{N-2} = 1)$. This diversity of possible controllers increases geometrically as (N-k) increases even for the relatively simple problem (7.2) - (7.5). It is this diversity of parametric cases that makes it difficult to make further descriptive qualitative statements about optimal JLQ controllers that have generality (even though we can solve for the controller in each specific case via section 7.2).


Figure 7.9: Shape of $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ for (a) Case 1 as in (7.6); (b) Case 2 as in (7.7) with (7.10) not holding; (c) Case 2 as in (7.7) with (7.10) holding.


Let us now consider the possible shapes of the next stage conditional expected cost-to-go $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$. Recall that $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ is a piecewise-quadratic function of $x_{N-1}$:

$$\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1) = x_{N-1}^2\,\hat K_{N-1}(i) + x_{N-1}\,\hat H_{N-1}(i) + \hat G_{N-1}(i) \quad \text{for } x_{N-1} \in \Delta_{N-1}(i),\ i = 1, \ldots, 7 \qquad (7.11)$$

where the regions $\{\Delta_{N-1}(i)\}$ of $x_{N-1}$ values in (7.11) are those specified by the composite partition in step 1 of section 5.4. This partition (as described in section 5.4) is constructed by superimposing the grids due to the joining points $\{\delta_{N-1}(i) : i = 1,2,3,4\}$ of $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ and the form transition probability discontinuities $\{\nu_{N-2}(1) = -\alpha,\ \nu_{N-2}(2) = \alpha\}$. The number of regions of $x_{N-1}$ values induced by this partition is 7, except for the degenerate cases of $\delta_{N-1}(1) = -\alpha$ or $\delta_{N-1}(2) = -\alpha$. There are three different nondegenerate situations that can occur, as listed in table 7.2.


$\gamma_{N-1}(1) < \gamma_{N-1}(2) < \gamma_{N-1}(3) < \gamma_{N-1}(4) < \gamma_{N-1}(5) < \gamma_{N-1}(6)$

Table 7.2 Nondegenerate Partitions of $x_{N-1}$ Values

The value of the open loop dynamics a(1) in form 1 determines which of the situations in table 7.2 applies. From the equations for the $\delta_{N-1}(i)$ we obtain the following directly:


For case 1 problems (where $\hat K_N(2) < \hat K_N(1) = \hat K_N(3)$)


(7.12) 


R (1)  +  b  (1)  (2) 

R ( 1)  +  b2(l)  K^(l) 


(7.12) 


with  (2)  in  table  7.2  corresponding  to  case  1  problems  with  a(l) 
between  the  two  values  in  (7.12)  -  (7.13)  . 


AAA 

For case 2 problems (where $\hat K_N(2) > \hat K_N(1) = \hat K_N(3)$)


(7.14) 

2 

(3)£s4  a(l)>  1  +  — -(1)  K  (1)  (7.15) 

R(l) 

with  (2)  in  table  7.2  corresponding  to  case  2  problems  with  a(l) 
between  the  two  values  in  (7.14)  -  (7.15). 


From the above we see that the partition ordering (in table 7.2) is a stability-related property of the form 1 open loop dynamics:

• for case 1 problems, all open loop stable systems in form 1 (i.e. a(1) < 1) satisfy (1) in table 7.2 (as do some not-too-unstable open loop systems). Only unstable a(1) can yield situations (2) and (3).

• for case 2 problems, only open loop unstable systems in form 1 satisfy (3) in table 7.2. If there are local minima for $V_{N-1}(x_{N-1}, r_{N-1} = 1)$, which occurs if and only if (7.10) holds, then all open loop stable (a(1) < 1) systems satisfy situation (1) in table 7.2. Since the right-hand side of (7.14) is less than one if, and only if, (7.10) holds, we have that if there are no local minima in $V_{N-1}(x_{N-1}, r_{N-1} = 1)$ (except for the global minimum at $x_{N-1} = 0$) then only unstable a(1) yield situation (2) in table 7.2.

We have now characterized the different $x_{N-1}$ partitions that can arise in step 1 of section 5.4 (and block 4 of the algorithm of section 7.2) at time k = N-1. In section 6.6 we saw how the shape of the conditional expected cost-to-go $\hat V_N(x_N \mid r_{N-1} = 1)$ was directly related to the qualitative properties of $V_{N-1}(x_{N-1}, r_{N-1} = 1)$. Similarly the shape of $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ is intimately tied to the qualitative properties of the next stage solution, $V_{N-2}(x_{N-2}, r_{N-2} = 1)$. The number of qualitatively different shapes of $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ that can arise for the single form-transition problem (and lead to significantly different optimal controller properties) is large. To demonstrate this we illustrate in figures 7.10 - 7.12 the six different basic shapes that $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ can take for case 1 problems (even more shapes arise for case 2 problems). Depending upon the relative values of the problem parameters $K_{N-1}(t:1) + Q(1)$, $t = 1, \ldots, m_{N-1}(1)$, and $K_{N-1}(1:2) + Q(2)$, the conditional expected cost-to-go $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ can be monotone nonincreasing for $x_{N-1} < 0$, or not, for each of the composite partition situations (1) - (3) listed in table 7.2.

In  this  section  we  have  indicated  how,  in  only  two  time  steps,  the 
number  of  parametrically  determined,  qualitatively  different  optimal 
JLQ  controllers  becomes  large  for  the  relatively  simple  single  form- 
transition  problem.  In  the  remainder  of  this  chapter  we  will  obtain 
parametric  conditions  on  (7.2)  -  (7.5)  that  specify  particular  sub¬ 
classes  of  the  JLQ  problem  which  possess  special  structures  in  the 
composite  partition  of  x^  (at  each  time  k)  and  thus  in  the  optimal 
controller  as  well. 


The  existence  of  local  minima  makes  case  2  problems  more  complicated. 


Figure 7.10: Possible $\hat V_{N-1}(x_{N-1} \mid r_{N-2} = 1)$ shapes for Case 1 in Situation 1 of Table 7.2. In (b) we can have $y_1 < y_2 < y_3$, $y_2 < y_1 < y_3$ or $y_2 < y_3 < y_1$.
7.4 Qualitative Behavior of the Optimal Controller in the Single
Form-Transition Problem: Bounds, Endpieces and Middlepiece


In  this  section  we  identify  certain  conditions  on  the  parameters 
of  the  single  form-transition  problem  (7.2)  -  (7.5)  which  yield 
example  problems  that  have  a  fairly  simple  structure  (so  as  to  be 
amenable  to  detailed  analysis)  but  are  still  sufficiently  general  to 
expose the tradeoffs between the reliability and performance goals of
fault-tolerant controllers.


In chapter 6 we established certain facts about the optimal expected cost-to-go $V_k(x_k, r_k = 1)$, as k decreases from N. These included the following:

(1) The endpieces of $V_k(x_k, r_k = 1)$, valid for extremely negative and positive values of $x_k$, are described by the same finite positive quadratic cost function $V_k^{Le}(x_k, 1) = V_k^{Re}(x_k, 1)$ of $x_k$, given by (6.99) - (6.104). They are the same functions due to the symmetry (about 0) of the problem. The endpiece cost function converges, as (N-k) increases, to the steady-state endpiece cost function specified by $K_\infty^{Le}(1) = K_\infty^{Re}(1)$ in (6.106).

(2) The switching region (between the endpieces) has finite width at each time k, but as $(N-k) \to \infty$ this width grows without bound (Proposition 6.3).

(3) The expected cost-to-go $V_k(x_k, r_k = 1)$ has a single middlepiece cost function, given by (6.99), (6.102), (6.105). This finite positive quadratic function of $x_k$ converges to the steady-state middlepiece cost function specified by $K_\infty^{RM}(1) = K_\infty^{LM}(1)$ in (6.107).

(4) $V_k(x_k, r_k = 1)$ lies between the upper bound function $V_k^{UB}(x_k, 1)$ and lower bound function $V_k^{LB}(x_k, 1)$ that are specified in Proposition 6.7, at each value of $x_k$. These bounds are quadratic (not piecewise quadratic) functions of $x_k$. As $(N-k) \to \infty$, these bounds converge to the steady-state bounds $V_\infty^{UB}(x, 1)$ and $V_\infty^{LB}(x, 1)$ given in (6.86) - (6.87).

Now what else can we say, in general, about the qualitative properties of the optimal controller for the single form-transition problem?


Let us restrict our attention to single form-transition problems (7.2) - (7.5) where, at each time k, the sum of the $x_k$ cost and the expected cost-to-go from $x_k$ is higher in form $r_k = 2$ than it is in form $r_k = 1$. This situation is what we expect to occur when r = 2 denotes operation in a failed or degraded mode. Thus we are focusing here on single form-transition JLQ problems that are appropriate representations for fault-tolerant control applications. The following conditions ensure this situation:


Since  a(l)  >0,  by  Proposition  6.7. 


Fact 7.1: For the single form-transition JLQ control problem of (7.2) - (7.5), if

$Q(1) \le Q(2)$    (7.16)
$K_T(1) \le K_T(2)$    (7.17)
(same or greater x-cost charged in form 2 than in form 1)

$0 < a(1) \le a(2)$    (7.18)
(form 2 not more stable than form 1)

$$\frac{a^2(1)}{a^2(2)} \le \frac{b^2(1)/R(1)}{b^2(2)/R(2)} \qquad (7.19)$$
(ratio of control effectiveness in form 1 to form 2 is greater than (or equal to) square of ratio of open loop dynamics)

then at each time k = N, N-1, N-2, ..., $k_0 + 1$ we have

$$Q(1)\,x_k^2 + V_k^{UB}(x_k, r_k = 1) \le Q(2)\,x_k^2 + V_k(x_k, r_k = 2) \qquad (7.20)$$

Proof: Conditions (7.16) - (7.17) guarantee that (7.20) holds at time k = N. The additional conditions (7.18) - (7.19) and direct substitution for $V_k^{UB}(x_k, r_k = 1)$ and $V_k(x_k, r_k = 2)$ from (6.74), (6.86), (6.78) - (6.79) and (6.94), (6.96) yield an inductive proof of (7.20) as k decreases. □

Thus, when (7.16) - (7.19) hold, the x-cost and cost-to-go in the failure mode r = 2 are greater than (or equal to) that in the normal form r = 1, at every time k. Conditions (7.16) and (7.17) are obviously appropriate if form r = 2 is a failure mode of operation, and (7.18) is not unreasonable. Condition (7.19) is not excessively restrictive. It says that for the problem (7.2) - (7.5) to have structure (7.20) we must have:

• open loop squared dynamics ratio $a^2(1)/a^2(2)$ small enough (failure mode not too stable relative to normal operation)

• energy cost ratio R(2)/R(1) large enough (cost of energy in failure mode not too low; in normal mode, not too high)

• actuator gain squared ratio $b^2(1)/b^2(2)$ large enough (actuator gain large enough in normal operation and not too large after failure).

Note that, by (7.18), condition (7.19) (and hence (7.20)) is satisfied if

$$\frac{b^2(1)}{R(1)} \ge \frac{b^2(2)}{R(2)} \qquad (7.21)$$
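The parameter conditions of Fact 7.1 are simple enough to check mechanically. The sketch below is an illustration only: it encodes (7.19) in the form suggested by its verbal annotation (squared dynamics ratio no larger than the ratio of control effectiveness b^2/R), cross-multiplied to avoid division, and assumes R(1), R(2) > 0. The function name and the dictionary layout keyed by form index are hypothetical.

```python
def fact_7_1_conditions(a, b, R, Q, KT):
    """Evaluate conditions (7.16)-(7.19) for a two-form scalar problem.

    Each argument is a dict keyed by form index (1 = normal, 2 = failed).
    Returns a dict mapping the equation label to True/False.
    """
    return {
        "7.16": Q[1] <= Q[2],
        "7.17": KT[1] <= KT[2],
        "7.18": 0 < a[1] <= a[2],
        # a(1)^2 / a(2)^2 <= [b(1)^2 / R(1)] / [b(2)^2 / R(2)], cross-multiplied
        "7.19": a[1] ** 2 * b[2] ** 2 * R[1] <= a[2] ** 2 * b[1] ** 2 * R[2],
    }

def satisfies_fact_7_1(a, b, R, Q, KT):
    """True when all four conditions hold."""
    return all(fact_7_1_conditions(a, b, R, Q, KT).values())
```

A failure that halves the actuator gain and destabilizes the plant, with all weights unchanged or increased, passes all four tests; swapping the actuator gains of the two forms violates (7.19) when a(1) = a(2).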

We will now further restrict our attention to problems where the functions $V_k(x_k, r_k = 2)$, $V_k^{LM}(x_k, r_k = 1) = V_k^{RM}(x_k, r_k = 1)$ and $V_k^{Le}(x_k, r_k = 1) = V_k^{Re}(x_k, r_k = 1)$ all increase monotonically as (N-k) increases. This restriction (which is made for analytical convenience) is characterized by the following:

Fact 7.2: For the single form-transition JLQ control problem of (7.2) - (7.5), with conditions (7.16) - (7.19) of Fact 7.1 holding,

1) $K_k(1:2)$ (hence $V_k(x_k, r_k = 2)$ for each $x_k$) increases monotonically as (N-k) increases if and only if

$$0 \le K_T(2) < \frac{R(2)[a^2(2) - 1] - b^2(2)Q(2) + \sqrt{\{R(2)[a^2(2) - 1] - b^2(2)Q(2)\}^2 + 4b^2(2)a^2(2)R(2)Q(2)}}{2b^2(2)} \qquad (7.22)$$

2) If, in addition to (7.22), condition (7.23) below is true for i = 1, then the middlepiece parameter $K_k^{LM}(1) = K_k^{RM}(1)$ (and hence $V_k^{LM}(x_k, 1)$ for each $x_k$) increases monotonically as (N-k) increases.

3) If, in addition to (7.22), condition (7.23) below is true for i = 2, then the endpiece parameters $K_k^{Le}(1) = K_k^{Re}(1)$ (and hence $V_k^{Le}(x_k, 1)$ and $V_k^{Re}(x_k, 1)$ for each $x_k$) increase monotonically as (N-k) increases:

$$0 \le K_T(1) < \frac{-\{R(1)(1 - a^2(1)(1 - \omega_i)) + b^2(1)\,c_i\} + \sqrt{\{R(1)(1 - a^2(1)(1 - \omega_i)) + b^2(1)\,c_i\}^2 + 4a^2(1)R(1)b^2(1)(1 - \omega_i)\,c_i}}{2b^2(1)(1 - \omega_i)} \qquad (7.23)$$

where $c_i = Q(1) + \omega_i\,(K_T(2) + Q(2) - Q(1))$.


Condition (1) says that $K_k(1:2)$ increases as (N-k) increases if the terminal cost $K_T(2)$ is not too large. Conditions (2) and (3) require that terminal cost $K_T(1)$ is not too large. In particular, Fact 7.2 holds if $K_T(1) = K_T(2) = 0$.

The proof of Fact 7.2 is by induction. It involves straightforward but tedious algebraic manipulations that are detailed in Appendix C.11.
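The monotonicity claim for the failed form can be checked numerically. The sketch below is an illustration under a stated assumption: it takes the form-2 cost-to-go coefficient to obey the scalar recursion K_k = a^2 R (K_{k+1} + Q) / (R + b^2 (K_{k+1} + Q)), which is the recursion whose positive fixed point is the right-hand side of (7.22). The recursion actually used in chapter 6 is not reproduced in this section, and the function names are hypothetical.

```python
import math

def steady_state_bound(a, b, R, Q):
    """Positive fixed point of the assumed recursion (right-hand side of (7.22))."""
    c = R * (a * a - 1.0) - b * b * Q
    return (c + math.sqrt(c * c + 4.0 * b * b * a * a * R * Q)) / (2.0 * b * b)

def backward_sweep(KT, a, b, R, Q, steps):
    """Iterate the assumed recursion backward from the terminal coefficient KT.

    Returns [K_N, K_{N-1}, ..., K_{N-steps}].
    """
    Ks = [KT]
    for _ in range(steps):
        M = Ks[-1] + Q                      # next-stage x-cost plus cost-to-go weight
        Ks.append(a * a * R * M / (R + b * b * M))
    return Ks
```

With a = 1.2 and b = R = Q = 1, starting from K_T = 0 gives a sequence that rises toward roughly 0.952, while starting above that value gives a sequence that falls toward it.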

The parameter restrictions of Facts 7.1 and 7.2 also guarantee the following strong relationship between the functions (of $x_k$) that describe the middlepiece, endpiece and bounds of $V_k(x_k, r_k = 1)$.

Fact 7.3: For the single form-transition JLQ control problem (7.2) - (7.5) with parameter values satisfying conditions of Fact 7.1 we have the following:

(1) If

$\omega_1 < \omega_2$    (7.24)

(that is, the "failure probability" p(1,2:x) is higher away from x = 0 than near it) then at all times k = N, N-1, N-2, ...

• the endpieces $V_k^{Le}(1)$, $V_k^{Re}(1)$ and upper bound $V_k^{UB}(1)$ of $V_k(x_k, r_k = 1)$ are given by the same function of $x_k$ (that is, $K_k^{Le}(1) = K_k^{Re}(1) = K_k^{UB}(1)$ for all k)

and

• the middlepiece $V_k^{LM}(1) = V_k^{RM}(1)$ and lower bound $V_k^{LB}(1)$ of $V_k(x_k, r_k = 1)$ are given by the same function of $x_k$ (that is, $K_k^{LM}(1) = K_k^{RM}(1) = K_k^{LB}(1)$ for all k).

(2) If

$\omega_1 > \omega_2$    (7.25)

(that is, the "failure probability" p(1,2:x) is higher near x = 0 than away from it) then at all times k = N, N-1, N-2, ...

• the endpieces $V_k^{Le}(1)$, $V_k^{Re}(1)$ and lower bound $V_k^{LB}(1)$ of $V_k(x_k, r_k = 1)$ are given by the same function of $x_k$ (that is, $K_k^{Le}(1) = K_k^{Re}(1) = K_k^{LB}(1)$ for all k)

and

• the middlepiece $V_k^{LM}(1) = V_k^{RM}(1)$ and upper bound $V_k^{UB}(1)$ of $V_k(x_k, r_k = 1)$ are given by the same function of $x_k$ (that is, $K_k^{LM}(1) = K_k^{RM}(1) = K_k^{UB}(1)$ for all k).

The proof of this fact is a straightforward induction, given in Appendix C.12.


The two cases of p(1,2:x) in (7.24) - (7.25) are shown in figure 6.6. The general shapes of $V_k(x_k, r_k = 1)$ that result from these (at any k) are illustrated in figure 7.13. Note that (7.24) and figure 7.13(a) correspond to problems where the twin goals of system performance and reliability are commensurate; driving x toward zero reduces the operating cost at k and keeps the probability of failure small. Thus far from zero $V_k(x_k, r_k = 1)$ is coincident with its upper bound, and near x = 0 the lower bound function is reached since both goals are being met. Figure 7.13(b) and (7.25) pertain to fault-tolerant control problems where these goals are contradictory; driving x towards zero (inside of $(-\alpha, \alpha)$) reduces the operating cost but increases the probability of failure. Therefore, near zero $V_k(x_k, r_k = 1)$ is equal to its upper bound, since the probability of failure (and higher cost, by Fact 7.1) is high; far from zero, $V_k(x_k, r_k = 1)$ is equal to its lower bound since the risk of failure is kept small.

In this section we have specified certain parametric restrictions (i.e. (7.16) - (7.19), (7.22) - (7.23)) for which the single form-transition problem (7.2) - (7.5) has a simplified solution structure. This reduced class is nevertheless rich enough to include problems with conflicting control goals and some with commensurate ones. These two cases will be studied in greater detail in the next two sections.


7.5 Active Hedging When Reliability and Performance Goals are Commensurate

As we have discussed previously, the optimal JLQ controllers for problems having x-dependent form processes (which are the subject of chapters 5-7) are qualitatively different from the JLQ controllers for Markovian form systems (as in chapter 3), in that they can use the control to change the probabilities of form transitions. That is, they actively hedge. In this section we will study this active hedging behavior for a class of JLQ problems with x-dependent form transitions where the system reliability and performance goals are commensurate. Our purpose here is to illustrate how the optimal controller uses active hedging to achieve fault tolerance. In the next section we will consider the use of active hedging when the system performance and reliability goals are at cross-purposes.

We will consider here the "case 1" single form-transition problem of the last section, as (N-k) increases. Under certain additional parametric conditions (that are derived here), the optimal JLQ controller at all times k = N, N-1, ..., $k_0$ can be specified by recursive difference equations; that is, $V_k(x_k, r_k = 1)$ and $u_k(x_k, r_k = 1)$ can be obtained without performing the various comparisons and tests of the general solution algorithm that is flowcharted in section 7.2. In this section we will discuss the optimal steady-state controller for this example, as (N-k) approaches infinity.

The discussion here will be carried out primarily via figures. A detailed technical development paralleling this section is contained in appendices C.13 and C.14.


Let us consider the scalar, single form-transition problem (7.2)-(7.5)
where Facts 7.1, 7.2 and 7.3(1) hold. We also assume that (7.12) holds.
That is, $\pm\alpha$ are the grid points of the $x_{N-1}$ composite
partition that are closest to zero. In figure 7.14 we collect (for
convenience) the curves of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ for this problem.

Let us consider now the candidate cost-to-go functions (of $x_{N-2}$) that
are eligible(1) for $V_{N-2}(x_{N-2}, r_{N-2}=1)$. In figure 7.15 we show
the candidate functions $V^{t,U}_{N-2}$ $(t = 1, \ldots, 7)$. Recall that
these functions of $x_{N-2}$ coincide with solutions of the constrained
(in $x_{N-1}$)(2) problems:

    $\min_{u_{N-2}} \{ V_{N-2}(x_{N-2}, r_{N-2}=1) \}$
    s.t.  $x_{N-1} \in \Delta_{N-1}(t) = (\gamma_{N-1}(t-1), \gamma_{N-1}(t))$

for those $x_{N-2}$ values where the resulting value of $x_{N-1}$ is in the
interior of $\Delta_{N-1}(t)$.

The following relationships that are pictured in figure 7.15 are
verified in appendix C.13:

    $V^{4,U}_{N-2} < V^{3,U}_{N-2} = V^{5,U}_{N-2} < V^{1,U}_{N-2} = V^{7,U}_{N-2}$  at all $x_{N-2}$    (7.26)


VN^  >Vn-2  for  a11  xn-2  excePt  equality  at  xN_2  =  9N_2  (3) /a  (1)« 

(7.27) 

>  VNl"  for  all  xN_2  except  equality  at  xN_2  =  SN_  ,  (6) /a  ( 1 ) . 

(7.28) 

In addition to the seven candidate cost functions that are shown in figure 7.15,


(1) In the sense of section 7.2.

(2) Here $\gamma_{N-1}(0) = -\infty$ and $\gamma_{N-1}(7) = +\infty$.
there are two other eligible candidate functions for
$V_{N-2}(x_{N-2}, r_{N-2}=1)$. These are those "constrained" functions(1)
which, according to Prop. 5.2 (and block 15 of the algorithm of section
7.2), correspond to hedging to the lower cost side of a
$\hat{V}_{N-1}(x_{N-1} \mid r_{N-2}=1)$ discontinuity; that is, driving
$x_{N-1}$ to $-\alpha^+$ or to $\alpha^-$ (see figure 7.14(d)). These
candidate cost functions are:

    $V^{4,L}_{N-2}(x_{N-2}, r_{N-2}=1)$   driving $x_{N-1}$ to $-\alpha^+$
                                          (left boundary of $\Delta_{N-1}(4)$)

    $V^{4,R}_{N-2}(x_{N-2}, r_{N-2}=1)$   driving $x_{N-1}$ to $\alpha^-$
                                          (right boundary of $\Delta_{N-1}(4)$).

The parameters of $V^{4,L}_{N-2}$ and $V^{4,R}_{N-2}$ are given in
appendix C.13.

If we superimpose the curves of $V^{4,L}_{N-2}$ and $V^{4,R}_{N-2}$ on
figure 7.15 and compare the regions of validity of each curve(2), then we
can obtain the optimal expected cost-to-go

    $V_{N-2}(x_{N-2}, r_{N-2}=1) = \min_{t=1,\ldots,7} \{ \min_{u_{N-2}} V_{N-2}(x_{N-2}, r_{N-2}=1) \ \text{s.t.}\ x_{N-1} \in \Delta_{N-1}(t) \}$

(see chapter 5) for each value of $x_{N-2}$. However $V^{4,L}_{N-2}$ and
$V^{4,R}_{N-2}$


(1) Corresponding to driving $x_{N-1}$ to one of the boundary points
$\gamma_{N-1}(1), \ldots, \gamma_{N-1}(6)$ of the composite partition
intervals $\Delta_{N-1}(1), \ldots, \Delta_{N-1}(7)$.

(2) As in section 7.2.


can be in three different graphical positions relative to the
$V^{t,U}_{N-2}$, depending upon the values of the problem parameters. We
will examine each of these three possibilities and analyze the
relationships between the qualitative properties of the optimal
controllers which they specify and the problem parameters.
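The comparison of candidate curves over their regions of validity can be sketched numerically. The following is only a minimal illustration and not the flowcharted algorithm of section 7.2: each candidate is assumed to be a quadratic $Kx^2 + Hx + G$ that is eligible on a given interval, and the optimal cost at a point is taken as the smallest eligible value. The function name, the tuple layout and all numerical values are hypothetical.

```python
import math


def optimal_cost(x, candidates):
    """Return (cost, label) of the cheapest candidate that is eligible at x.

    candidates: list of (label, (K, H, G), (lo, hi)); the cost of a candidate
    is K*x**2 + H*x + G and it is only eligible for lo <= x <= hi.
    Returns (inf, None) when no candidate is eligible at x.
    """
    best_cost, best_label = math.inf, None
    for label, (K, H, G), (lo, hi) in candidates:
        if lo <= x <= hi:
            cost = K * x * x + H * x + G
            if cost < best_cost:
                best_cost, best_label = cost, label
    return best_cost, best_label


# Hypothetical candidates (illustrative numbers, not taken from the thesis).
candidates = [
    ("endpiece", (2.0, 0.0, 0.0), (-math.inf, -0.5)),
    ("hedge-to-a-point", (1.2, 0.5, 0.3), (-3.0, -0.5)),
    ("middlepiece", (1.0, 0.0, 0.0), (-1.0, 1.0)),
]

for x in (-5.0, -2.0, 0.0):
    print(x, optimal_cost(x, candidates))
```

Sweeping x over a grid and recording where the winning label changes gives a numerical estimate of the joining points between pieces.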

We first note that, by definition:

    $V^{4,L}_{N-2} > V^{4,U}_{N-2}$  except equality at  $x_{N-2} = -\theta_{N-2}(4)/a(1)$    (7.29)

    $V^{4,R}_{N-2} > V^{4,U}_{N-2}$  except equality at  $x_{N-2} = \theta_{N-2}(4)/a(1)$.    (7.30)

Now $V^{4,U}_{N-2}(x_{N-2}, 1)$ is the middlepiece cost function for
$V_{N-2}(x_{N-2}, r_{N-2}=1)$; thus, as we have shown, it is also a lower
bound for this problem.(1) Consequently $V^{4,U}_{N-2}$ must be optimal
over its entire valid domain; that is,

    $V_{N-2}(x_{N-2}, r_{N-2}=1) = V^{4,U}_{N-2}(x_{N-2}, 1)$
    for  $-\theta_{N-2}(4)/a(1) \le x_{N-2} \le \theta_{N-2}(4)/a(1)$.    (7.31)

From Proposition 6.1 we know that the optimal expected cost-to-go
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ coincides with the endpiece cost function
$V^{1,U}_{N-2}$ for $x_{N-2}$ negative enough, and it coincides with
$V^{7,U}_{N-2}$ for $x_{N-2}$ large enough. We also know (from
Proposition 5.1) that $V_{N-2}(x_{N-2}, r_{N-2}=1)$ is continuous in
$x_{N-2}$.


(1) See Fact 7.3.


Refer now to figure 7.15. Considering the optimality of $V^{1,U}_{N-2}$
for $x_{N-2}$ sufficiently negative, of $V^{4,U}_{N-2}$ for $x_{N-2}$ near
zero, and of $V^{7,U}_{N-2}$ for $x_{N-2}$ sufficiently large, and bearing
in mind the required monotonicity of the optimal mapping(1)

    $x_{N-2} \mapsto x_{N-1}(x_{N-2}, r_{N-2}=1)$,

the remaining question that must be resolved so as to find
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ for this problem is the following: how does
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ "get from" the $V^{1,U}_{N-2}$ curve to the
$V^{4,U}_{N-2}$ curve as $x_{N-2}$ increases (and from $V^{4,U}_{N-2}$ to
$V^{7,U}_{N-2}$)?

The three possible situations that can occur, depending upon the
values of the problem parameters, are shown in figures 7.16-7.18 and
are summarized in table 7.3. Complete details for each case appear
in appendix C.13.

The first situation (shown in figure 7.16) results in
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ having $m_{N-2}(1) = 9$ pieces(2). Each piece
corresponds to a different active hedging strategy using $u_{N-2}$ and
$u_{N-1}$. In the endpieces $V_{N-2}(1:1)$ and $V_{N-2}(9:1)$ the
controller does not use controls $u_{N-2}$, $u_{N-1}$ to change the
p(1,2:x) piece that the x process is in. In the middlepiece
$V_{N-2}(5:1)$ we have $|x_{N-1}| < \alpha$ and $|x_N| < \alpha$. The left
and right switching regions ($S^{1,L}_{N-2}$ and $S^{1,R}_{N-2}$) of
$x_{N-2}$ values are divided into three parts:

•  intervals of $x_{N-2}$ values where immediate hedging-to-a-point
   (to $x_{N-1} = -\alpha^+$ with cost $V_{N-2}(4:1)$ or to
   $x_{N-1} = \alpha^-$ with cost $V_{N-2}(6:1)$) using $u_{N-2}$ is optimal

•  intervals of $x_{N-2}$ values where the optimal strategy uses control
   $u_{N-1}$ to hedge-to-a-point (to $x_N = -\alpha^+$ with cost
   $V_{N-2}(2:1)$ or to $x_N = \alpha^-$ with cost $V_{N-2}(8:1)$)

•  intervals of $x_{N-2}$ values where $x_{N-1}$ and $x_N$ are in
   different p(1,2:x) pieces but hedging-to-a-point is not used
   ($V_{N-2}(3:1)$ and $V_{N-2}(7:1)$ are the optimal costs here).

(1) From Proposition 5.3 we know that this mapping is always monotone.

(2) This is the upper bound on $m_{N-2}(1)$, according to Proposition 5.4.
The second situation (shown in figure 7.17) results in
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ having only $m_{N-2}(1) = 7$ pieces. Unlike
the first situation of figure 7.16, there are no $x_{N-2}$ values from
which the optimal controller causes $x_{N-1}$ and $x_N$ to be in different
p(1,2:x) pieces without using hedging-to-a-point. Here the switching
regions ($S^{1,L}_{N-2}$, $S^{1,R}_{N-2}$) are each divided into two parts:

•  intervals of $x_{N-2}$ where immediate hedging-to-a-point
   (to $x_{N-1} = -\alpha^+$ with cost $V_{N-2}(3:1)$ or to
   $x_{N-1} = \alpha^-$ with cost $V_{N-2}(5:1)$) using $u_{N-2}$ is optimal

•  intervals of $x_{N-2}$ values where the optimal strategy keeps
   $x_{N-1}$ in the same p(1,2:x) piece as $x_{N-2}$, but then uses control
   $u_{N-1}$ to hedge-to-a-point (to $x_N = -\alpha^+$ with cost
   $V_{N-2}(2:1)$ or to $x_N = \alpha^-$ with cost $V_{N-2}(6:1)$).

Figure 7.16: $V_{N-2}(x_{N-2}, r_{N-2}=1)$ in the first situation. The
optimal cost is indicated by the heavier line. The resulting optimal
values of $x_N$ and $x_{N-1}$ for each of the 9 pieces of
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ are also indicated. (t:1) denotes the t-th
piece (from the left) of $V_{N-2}(x_{N-2}, r_{N-2}=1)$.

In the third possible situation that can arise for
$V_{N-2}(x_{N-2}, r_{N-2}=1)$, as shown in figure 7.18, the optimal
controller has only $m_{N-2}(1) = 5$ pieces (the same number of pieces that
$V_{N-1}(x_{N-1}, r_{N-1}=1)$ has). In this situation the optimal
controller drives $x_{N-1}$ into the same p(1,2:x) piece into which it
drives $x_N$. That is, active hedging (if any) is done immediately (i.e.,
at time k = N-1) using control $u_{N-2}$. The control $u_{N-2}$ is used to
hedge to $x_{N-1} = -\alpha^+$ (for $V_{N-2}(2:1)$) or to
$x_{N-1} = \alpha^-$ (for $V_{N-2}(4:1)$).

Various aspects of these three situations are summarized in table 7.3.
What we would like to do is relate these various active hedging strategies
to the values of the problem parameters. We first note that from figures
7.16-7.18 we can obtain the following graphical conditions related to
these three possible situations for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ in this
problem. These are:

Fact 7.4: For the problem (7.2)-(7.5) with facts 7.1, 7.2, 7.3(1) and
(7.12) holding, we have the following:

(1)  Situation  (1),  as  in  figure  7.16,  occurs  if  and  only  if  the 
leftmost  intersection  of  the  two  quadratic  functions 

VN-2L  (XN-2}  and  VN-2  {XN-2>  iS  t0  th®  right  °f  6N-2 (3)/a(1) • 

(2)  Situation  (3),  as  in  figure  7.18,  occurs  if  and  only  if 

the  leftmost  intersection  of  V^'^(x„  -)  and  V V ^ ( x„  .)  is  to 

N-2  N“2  N-2  N-2 


the  left  of  or  exactly  at  the  leftmost  intersection  of 


                                   Situation 1      Situation 2      Situation 3
                                   (Fig. 7.16)      (Fig. 7.17)      (Fig. 7.18)

Number of pieces m_{N-2}(1) of     9 (maximum       7                5
V_{N-2}(x_{N-2}, r_{N-2}=1)        possible)

Endpiece functions                 V_{N-2}(1:1)     V_{N-2}(1:1)     V_{N-2}(1:1)
                                   V_{N-2}(9:1)     V_{N-2}(7:1)     V_{N-2}(5:1)

Middlepiece function               V_{N-2}(5:1)     V_{N-2}(4:1)     V_{N-2}(3:1)

Other pieces that do not           V_{N-2}(3:1)     -                -
involve hedging to                 V_{N-2}(7:1)
alpha^- or -alpha^+

Pieces involving                   V_{N-2}(4:1)     V_{N-2}(3:1)     V_{N-2}(2:1)
hedging-to-a-point                 V_{N-2}(6:1)     V_{N-2}(5:1)     V_{N-2}(4:1)
with u_{N-2}

Pieces involving                   V_{N-2}(2:1)     V_{N-2}(2:1)     -
hedging-to-a-point                 V_{N-2}(8:1)     V_{N-2}(6:1)
with u_{N-1} (but not
with u_{N-2})

Table 7.3  Comparison of three possible situations
for $V_{N-2}(x_{N-2}, r_{N-2}=1)$
where  x^  is  given  by  (C.13.36).


(3)  situation  (2),  as  in  figure  7.17  occurs  if  and  only  if 
(7.32)  and  (7.33)  are  both  not  true.  A  necessary 
condition  is  that  1  1  where  x2  is  given  by 
(C.13.44) . 


Proof:  These  conditions  are  derived  in  appendix  C.13. 
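The graphical tests in Fact 7.4 compare the leftmost intersection of two quadratic candidate cost curves with a threshold. As a rough illustration of that kind of test (not the conditions (7.32)-(7.33) themselves, whose exact terms are given in appendix C.13), the sketch below computes the smallest real x at which two quadratics $Kx^2 + Hx + G$ agree. The function name and the coefficient tuples are hypothetical.

```python
import math


def leftmost_intersection(q1, q2):
    """Smallest real x with q1(x) == q2(x), or None if there is none.

    Each q is a tuple (K, H, G) standing for K*x**2 + H*x + G.
    """
    dK, dH, dG = (c1 - c2 for c1, c2 in zip(q1, q2))
    if dK == 0:
        # The difference is linear (or constant): at most one crossing.
        if dH == 0:
            return None
        return -dG / dH
    disc = dH * dH - 4.0 * dK * dG
    if disc < 0:
        return None  # the two curves never meet
    root = math.sqrt(disc)
    return min((-dH - root) / (2.0 * dK), (-dH + root) / (2.0 * dK))


# Hypothetical example: x**2 meets the constant 4 at x = -2 and x = 2.
x_left = leftmost_intersection((1.0, 0.0, 0.0), (0.0, 0.0, 4.0))
threshold = -3.0  # stands in for a quantity such as theta/a(1)
print(x_left, x_left > threshold)
```

A test of the form "the leftmost intersection lies to the right of a threshold" then reduces to one comparison once the two coefficient triples are known.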

Conditions (7.32)-(7.33) are implicit relationships that can
be tested for any given set of problem parameters. These conditions
are quite complicated, however, and it is consequently very
difficult to deduce general analytical properties from them.
However, we can use facts 7.1, 7.2, and 7.3(1) to obtain sufficient
conditions that are more easily analyzed.


Fact  7.6;  For  the  problem  (7.2)  -  (7.5)  with  facts  7.1,  7.2, 
7.3(1)  and  equation  (7.12)  holding 


(1)  if 


(1  + 


a(l)  < 


b2  (1) 
R(l) 


V2)1 


(u^-o^)  (KN_1(1:2)  +  Q  (2)  -  Q  (1)  -  K^d)) 


\ 


1  + 


(1) 
R(l) 


\  ^  1  +  rTiT1  I(1“w2)  (KT{1)  +  2U))  +  U)2(KN-1(1:2)  +  Q(2)n ) 


then situation (1) (figure 7.16) describes $V_{N-2}(x_{N-2}, r_{N-2}=1)$.


9 


(3)  If 


(7.38) 

then situation (2) (figure 7.17) describes $V_{N-2}(x_{N-2}, r_{N-2}=1)$.

Proof: See Appendix C.13.  □


Using  (C.13.52)  -  (C.13.53)  in  appendix  C.13,  we  can  compare  (7.32) 


with  (7.34)  and  (7.33)  with  (7.38)  .  We  find  that  for  (7,34)  and  (7,38) 
to  be  necessary  as  well  as  sufficient  we  would  need  either  b(l)  =  0  or 
U2  =  W1  °r  =  All  t^ese  possibilities  are  excluded 

by  facts  7.1  -  7.3.  That  is,  the  conditions  in  fact  7.6  are  always 
sufficient  but  not  necessary. 

Note that conditions (7.34)-(7.38) of fact 7.6 depend upon the value
of a(1) on only one side of each equation. This lets us interpret fact
7.6 as a set of conditions which relates values of a(1) (that is, the
stability of the form r=1 open loop dynamics) to the three active hedging
strategies of figures 7.16-7.18. For sufficiently small a(1), where
(7.34) is satisfied, we have the optimal controller of figure 7.16. For
larger a(1) satisfying (7.37)-(7.38) the optimal controller must
hedge-to-a-point with either $u_{N-2}$ or $u_{N-1}$, for all $x_{N-2}$ that
are not in the endpiece and middlepiece domains. For sufficiently unstable
a(1) satisfying (7.36) the optimal JLQ controller cannot delay in its
active hedging: for all $x_{N-2}$ inside the switching regions it will
immediately hedge-to-a-point using control $u_{N-2}$. Thus for this example
problem with commensurate performance and reliability goals, greater
instability of the normal operation dynamics (that is, larger values
of a(1)) forces the controller to actively hedge sooner. The optimal
controller must drive x into the advantageous(1) region of p(1,2:x) sooner
for large a(1) than for small a(1) because the larger value of a(1)
tends to push x deeper into the disadvantageous region of p(1,2:x).

Thus the cost of hedging-to-a-point (crossing into the good p(1,2:x) piece)

(1) Where the probability p(1,2:x) of a (costly) failure occurring is
smaller (that is, $|x| < \alpha$).


In the remainder of this section we will examine the optimal JLQ
controller for problems satisfying (7.34) of fact 7.6. What we find is
that at all times (N-k), the optimal JLQ controller follows the pattern
of figure 7.16. This lets us obtain a recursive description of
$V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$ at each
time (N-k).

A typical $V_{N-k}(x_{N-k}, r_{N-k}=1)$ curve is shown in figure 7.19. It
has $m_{N-k}(1) = 4k+1$ pieces:

•  2k pieces wholly to the left of zero

•  2k pieces wholly to the right of zero

and

•  the middlepiece around $x_{N-k} = 0$.

The curve is symmetric about $x_{N-k} = 0$.
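The piece count above can be tabulated directly. The small sketch below simply enumerates the left-to-right piece indices for k steps to go, under the 4k+1 count stated in the text; the function name and the dictionary layout are illustrative choices, not notation from the source.

```python
def piece_layout(k):
    """Left-to-right piece indices of the cost-to-go with k steps to go.

    Uses the count m = 4k + 1: 2k pieces left of zero, the middlepiece,
    and 2k pieces right of zero.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    total = 4 * k + 1
    middle = 2 * k + 1
    return {
        "total": total,
        "middle": middle,
        "left": list(range(1, middle)),
        "right": list(range(middle + 1, total + 1)),
    }


for k in (1, 2, 3):
    layout = piece_layout(k)
    print(k, layout["total"], layout["middle"])
```

For k = 2 this gives 9 pieces with the middlepiece at index 5, which matches the first situation described for time N-2.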


For $x_{N-k}$ in the middlepiece (i.e., for
$\delta_{N-k}(2k) < x_{N-k} < \delta_{N-k}(2k+1)$), the optimal controller
will result in $|x_{N-k+l}| < \alpha$ for $l = 1, \ldots, k$. That is, the
x process will be in the advantageous piece of p(1,2:x) at all future
times.

The first piece to the left of the middlepiece (i.e.,
$\delta_{N-k}(2k-1) < x_{N-k} < \delta_{N-k}(2k)$) corresponds to using
$u_{N-k}$ to achieve $x_{N-k+1} = -\alpha^+$. That is, we use the control
to immediately hedge to a point. Then the optimal controller will keep the
x process in the middlepiece from time (N-k)+2 through time N. Similarly
the first piece to the right of the middlepiece (i.e.,
$\delta_{N-k}(2k+1) < x_{N-k} < \delta_{N-k}(2k+2)$) corresponds to using
$u_{N-k}$ to immediately hedge to $x_{N-k+1} = \alpha^-$ (and to keeping
$|x_{N-k+l}| < \alpha$ for $l = 2, 3, \ldots, k$).
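Immediate hedging-to-a-point amounts to choosing the control that places the next state exactly on a boundary of the advantageous region. The sketch below assumes noiseless scalar form r=1 dynamics of the form x_next = a*x + b*u, which is an assumption made here for illustration (the problem statement (7.2) is not reproduced in this section). The function name and the numbers are hypothetical.

```python
def hedge_to_point_control(x, a, b, target):
    """Control u such that a*x + b*u equals target (requires b != 0).

    Written in feedback form this is u = -(a/b)*x + target/b.
    """
    if b == 0:
        raise ValueError("b must be nonzero for the control to act on x")
    return (target - a * x) / b


# Hypothetical numbers: unstable dynamics a = 1.5, input gain b = 0.5,
# boundary of the advantageous region at -alpha with alpha = 1.
a, b, alpha = 1.5, 0.5, 1.0
x = -2.0
u = hedge_to_point_control(x, a, b, -alpha)
x_next = a * x + b * u
print(u, x_next)
```

The larger a is, the larger the control needed to reach the boundary from a given x, which is one way to see why the cost of delaying the hedge grows with instability.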


The second pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left and right of
way of thinking about future hedging options that leads to the finite
time-horizon approximation of the infinite time problem that is
developed in section 7.7.

Before stating the solution to problem (7.2)-(7.5) with facts
7.1, 7.2, 7.3(1) and 7.6(1) holding, let us recall the following
shorthand notation. Let $V_{N-k}(i:1)$ and $u_{N-k}(i:1)$ denote the i-th
piece of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$,
respectively, counting from left to right. That is,

    $V_{N-k}(x_{N-k}, r_{N-k}=1) = V_{N-k}(i:1)$
    $u_{N-k}(x_{N-k}, r_{N-k}=1) = u_{N-k}(i:1)$                    (7.39)

    for  $\delta_{N-k}(i-1) \le x_{N-k} \le \delta_{N-k}(i)$,  $i = 1, \ldots, (4k+1)$.


Proposition 7.7: For the problem (7.2)-(7.5) with facts 7.1, 7.2, 7.3(1)
and 7.6(1) holding, the optimal JLQ controller can be described by:

1) The number of pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and
   $u_{N-k}(x_{N-k}, r_{N-k}=1)$, at time N-k, is

       $m_{N-k}(1) = 4k + 1$.                                     (7.40)

   This is the maximum possible number of pieces(1). That is, all eligible
   candidate cost-to-go functions (by Proposition 5.2) are optimal over
   some portion of their regions of validity.

   (1) For the general problem of chapter 5.

2) These eligible candidate costs are

       $V^{t,U}_{N-k}$,  $t = 1, \ldots, 4k-1$   (driving $x_{N-k+1}$ into $\Delta_{N-k+1}(t)$)

       $V^{2k,L}_{N-k}$   (hedging to $x_{N-k+1} = -\alpha^+$)

       $V^{2k,R}_{N-k}$   (hedging to $x_{N-k+1} = \alpha^-$)


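3) $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$ are
   symmetric about $x_{N-k} = 0$. That is,

       $V_{N-k}(i:1)\big|_{x_{N-k}} = V_{N-k}(4k+2-i:1)\big|_{-x_{N-k}}$          (7.41)

       $u_{N-k}(i:1)\big|_{x_{N-k}} = -u_{N-k}(4k+2-i:1)\big|_{-x_{N-k}}$         (7.42)

       $\delta_{N-k}(i) = -\delta_{N-k}(4k+1-i)$                                   (7.43)

   for $i = 1, 2, \ldots, (2k+1)$. (Here $\delta_{N-k}(0) = -\infty$.)

4) The closest grid points to zero in the composite $x_{N-k+1}$
   partition are $\pm\alpha$. That is,

       $\gamma_{N-k+1}(i) = \delta_{N-k+1}(i)$,      $i = 1, \ldots, 2k-2$
       $\gamma_{N-k+1}(2k-1) = -\alpha$
       $\gamma_{N-k+1}(2k) = \alpha$                                               (7.44)
       $\gamma_{N-k+1}(i) = \delta_{N-k+1}(i-2)$,    $i = 2k+1, \ldots, 4k-2$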


5) The odd-numbered pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and
   $u_{N-k}(x_{N-k}, r_{N-k}=1)$ are given by

       $V_{N-k}(i:1) = x_{N-k}^2 K_{N-k}(i:1)$                                     (7.45)
       $u_{N-k}(i:1) = -L_{N-k}(i:1)\, x_{N-k}$                                    (7.46)

   optimal for $\delta_{N-k}(i-1) \le x_{N-k} \le \delta_{N-k}(i)$,
   $i = 1, 3, 5, \ldots, (4k+1)$.

   These costs are optimal for $x_{N-k}$ values where the best
   strategy is not to use any of the controls $u_{N-k}, \ldots, u_{N-1}$
   to hedge-to-a-point. They include the left-endpiece

       $V^{Le}_{N-k}(1) = V_{N-k}(1:1)$,

   the middlepiece

       $V^{LM}_{N-k}(1) = V_{N-k}(2k+1:1)$,

   and the right-endpiece

       $V^{Re}_{N-k}(1) = V_{N-k}(4k+1:1)$.

6) The even-numbered pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and
   $u_{N-k}(x_{N-k}, r_{N-k}=1)$ are given by

       $V_{N-k}(i:1) = x_{N-k}^2 K_{N-k}(i:1) + x_{N-k} H_{N-k}(i:1) + G_{N-k}(i:1)$   (7.47)
       $u_{N-k}(i:1) = -L_{N-k}(i:1)\, x_{N-k} + F_{N-k}(i:1)$                         (7.48)

   optimal for $\delta_{N-k}(i-1) \le x_{N-k} \le \delta_{N-k}(i)$.

   These costs correspond to actively hedging-to-a-point
   at one future time. Specifically, at each time (N-k) < N and
   for each $i = 1, 2, \ldots, k$:

   $V_{N-k}(2i:1)$ is the cost associated with using control
   $u_{N-i}$ to hedge-to-a-point. That is, it is the expected
   cost-to-go which results if

       $u_{N-k+l-1}$ keeps $x_{N-k+l} < -\alpha$
       for each $l = 1, \ldots, k-i$ (when $k \ge 2$),

   and then

       $u_{N-i}$ hedges to $x_{N-i+1} = -\alpha^+$,

   and then

       $u_{N-k+l-1}$ keeps $-\alpha < x_{N-k+l} < 0$ for $l = k-i+2, \ldots, k$
       (when $k \ge 2$).

   So $V_{N-k}(2k:1)$ corresponds to hedging-to-a-point immediately
   (using $u_{N-k}$) and $V_{N-k}(2:1)$ corresponds to hedging-to-a-point
   at the last time (using $u_{N-1}$). Similarly, for $V_{N-k}(j:1)$
   (where $j = 2k+2, \ldots, 4k$) the optimal controller uses


The cost parameters in (7.45)-(7.48) are given recursively by
the following set of coupled difference equations:

•  For $k = N-1, N-2, \ldots$, the form r=2 parameters obey

       $K_k(1:2) = \dfrac{a^2(2)\, R(2)\, [K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2)\, [K_{k+1}(1:2) + Q(2)]}$

       $L_k(1:2) = \dfrac{b(2)}{a(2)\, R(2)}\, K_k(1:2)$

   with

       $K_N(1:2) = K_T(2)$


•  For  k=l,2,  —  ,N 


a2U)  RC1)  yk+1Uk) 

R(D  +  b2d)  Vk+I(2k) 


where 


Wi1!tl  -  * 5,1,1  * 


*  01,  (K  (Is?)  +  Q(2)  ] 

1  N-k+1 


with 


Kj|(l>  =  KT(1) 


and 


Vk  (2k+l  :1)  = 


b  (1) 


a(l)  R (1) 


KN_k(2k+l:l) 


(Here  ^(1)  i  £„(!>  =  K^U>  =  Vk  I**1'1*’ 


LB 


»  For  k=l,2,...,N  and  i=l,2,3, — ,2k-l 

S-k+l1 


a2(l)  R ( 1)  K.  U) 


KN-k(l:1)  “  .  ,_2  a 


R(l)  +  b  (1)  K  .  (i) 


(7.55) 


By  symmetry  we  have  (for  each  k=l,2,.... 


N)  for  i=l,2,...2k; 


K^Usl)  =  KN_k(4k+2-i:l)  (7.72) 

HN-k(iil)  =  _HN-k(4k+2“i:L)  (7.73) 
GN-k(i:1)  =  GN-k(4k+2_;l:1)  (7.74) 
FN-k(i:1)  =  FN-k(4k+2"i:1)  (7.75) 
LN-k(i:1)  =  'LN_k(4k+2“i:1)  (7*76) 


At  each  time  k  =  1,2,..,  we  have  the  following  relationships: 

Vkl!M!l>  <  W2*-1'1’  <  k  Vkl3!l>  <  Vk(l!l)  < 

5  Vkl2:1)  <  Vkl4iU  4  •••  4  Vk<2k:1) 

(7.77) 

SH-k(2i*l,1>  4  Vk<2ia)  4  W2 1711 11  l7-781 

I- 1  /  2  f  •  •  •  f  K 

Vk,!M1  <  Vk(!kH1  <  Vk12*-11  <  Vk(2k-3>  < 

<  Vk131  <  W1*  <  W2>  <  Vk<4> 

•••  <  Vk,!kl 

Here  Vk(2k+1:1>  '  Ck(1>  ’  ^-k11’  and 

Vk«i!i)  =Ck(1>  • 


(7.79) 


10) For $i = 1, \ldots, k$

        $V_{N-k}(2i:1) > V_{N-k}(2i+1:1)$  except equality at  $x_{N-k} = \delta_{N-k}(2i)$    (7.80)

    and $\delta_{N-k}(2i-1)$ is at the leftmost intersection of
    $V_{N-k}(2i:1)$ and $V_{N-k}(2i-1:1)$.

Proof: The proof of this theorem involves an induction on (N-k), starting
with k=2. The proof is developed in appendix C.14.  □

The remarkable thing about this problem is that the optimal control
law and expected cost-to-go at each time $k = k_0, \ldots, N$ (for any
finite time horizon problem) can be computed recursively from a (growing)
number of difference equations running backwards in time. We need not
follow all the flowchart comparisons and tests of section 7.2 for this
class of problems. Thus this problem lends itself to detailed analysis
and interpretation.
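As a concrete instance of a difference equation running backwards in time, the form r=2 cost parameter recursion of Proposition 7.7 can be iterated numerically. The sketch below assumes the scalar Riccati-type update K <- a^2 R (K + Q) / (R + b^2 (K + Q)) with gain L = b K / (a R); the squared a in the numerator is an assumption made for consistency with that gain expression, since the superscript is not legible everywhere in the source. Function names and parameter values are hypothetical.

```python
def form2_cost_sequence(a, b, q, r, k_terminal, steps):
    """Iterate K <- a^2 r (K + q) / (r + b^2 (K + q)) backwards in time.

    Returns [K_N, K_{N-1}, ..., K_{N-steps}], starting from K_N = k_terminal.
    """
    ks = [k_terminal]
    for _ in range(steps):
        s = ks[-1] + q  # next-step cost parameter plus state weight
        ks.append(a * a * r * s / (r + b * b * s))
    return ks


def feedback_gain(a, b, r, k):
    """Gain L in u = -L x that corresponds to cost parameter k."""
    return b * k / (a * r)


# Hypothetical parameters a = b = q = r = 1 with zero terminal cost.
ks = form2_cost_sequence(1.0, 1.0, 1.0, 1.0, 0.0, 50)
print(ks[:4], ks[-1], feedback_gain(1.0, 1.0, 1.0, ks[-1]))
```

With these illustrative numbers the sequence rises monotonically towards the positive root of K^2 + K - 1 = 0, which shows the kind of monotone convergence to a steady-state value that the text relies on.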

In particular, for this problem we can clearly see what each piece
of the controller is trying to accomplish. Refer again to fig. 7.19,
where a typical $V_{N-k}(x_{N-k}, r_{N-k}=1)$ is shown. The middlepiece is
the lower bound cost associated with not having to hedge-to-a-point,
because $x_{N-k+1}$ will be in the p(1,2:x) region that is best according
to both the system performance and reliability goals. The endpieces are
highest in cost because the controller never drives x into the good
p(1,2:x) region. The even-numbered pieces correspond to hedging-to-a-point
at successively further times in the future (as we move away from the
middlepiece in figure 7.19), with successively higher costs being incurred.

Optimal control law $u_{N-k}(x_{N-k}, r_{N-k}=1)$ for the problems
of Proposition 7.7

In figure 7.20 the optimal control law $u_{N-k}(x_{N-k}, r_{N-k}=1)$ is
shown for these problems. Note that this control law is discontinuous
at $\{\delta_{N-k}(i),\ i = 1, 3, \ldots, (2k-1)\}$, where
$V_{N-k}(x_{N-k}, r_{N-k}=1)$ is not differentiable. In figure 7.21 the
optimal mapping from $x_{N-k}$ to $x_{N-k+1}$ (given $r_{N-k}=1$) is
graphed. There is a region of avoidance associated with each control law
discontinuity. Thus at all times (N-k) we have the type of behavior
illustrated at time (N-1) in figures 6.10-6.12.

Let us now consider the JLQ optimal controller of Proposition 7.7
as the time horizon (N-k) grows large. From (7.40) we see that the
number of pieces of the optimal controller grows without bound as
(N-k) goes to infinity. Thus the exact infinite time horizon optimal
controller is not obtainable via any finite algorithm.

However (as we hinted prior to stating Prop. 7.7), as (N-k) grows
large, many of the additional controller pieces correspond to hedging
very far into the future. As might be expected, the savings obtained
by changing the p(1,2:x) domain that x is in become small as the time
when this change is effected becomes distant. As a result, the structure
of the optimal controller does converge to a steady-state controller which
we can approximate with arbitrarily small error by choosing suitably
large (N-k).

From  (7.57)  it  is  straightforward  to  show  that  for  any  k,  the 


k  time-sequence s  of  odd-indexed  cost  parameters ; 

{KN_k(2i-l:l)}^=1  (i=l, . . . ,k) 

are  each  given  by  the  same  recursive  difference  equation: 

•  compute  {KN_k(2i-l:l) via  (7.55)  -  (7.56) 
from  terminal  condition 

KN_1(2i-l:l)  =  ij.d)  .  (7.81) 

Only the terminal times and conditions are different in the computation
of each of these odd-indexed sequences. At a fixed time (N-k), these
k cost parameters

    $\{K_{N-k}(2i-1:1),\ i = 1, \ldots, k\}$

correspond to all of the pieces of $V_{N-k}$ to the left of the
middlepiece where the optimal controller does not hedge-to-a-point
with controls $u_{N-k}, \ldots, u_{N-1}$. Here i = 1 corresponds to the
left endpiece $K_{N-k}(1:1)$; i = k corresponds to $x_{N-k+1} < -\alpha$
but all of $x_{N-k+2}, \ldots, x_N$ greater than $-\alpha$; for an
arbitrary $i = 2, \ldots, k$, we have $x_{N-k+1}, \ldots, x_{N-i+1}$ all
less than $-\alpha$ and $x_{N-i+2}, \ldots, x_N$ all greater than $-\alpha$.

We know(1) that $K^{Le}_{N-k}(1)$ increases monotonically to
$K^{Le}_\infty(1)$ as $(N-k) \to \infty$, and that the convergence
$K_{N-k}(1:2) \to K_\infty(1:2)$ is monotonically increasing as well.
Thus

    $\lim_{(N-k)\to\infty} K_{N-k}(2i-1:1) = K^{Le}_\infty(1)$                    (7.82)

and this convergence is monotone increasing. That is, if one looks at
a particular odd-indexed cost parameter $K_{N-k}(2i-1:1)$ (for
$1 \le i \le k$, counting from the left) and lets $(N-k) \to \infty$, then
this cost parameter looks increasingly like the limiting left-endpiece
cost parameter $K^{Le}_\infty(1)$. That is, changing the transition
probability piece in the far future looks like never changing at all.
This is not surprising when one considers that the range of x values where

    $V_{N-k}(2i-1:1) = x_{N-k}^2 K_{N-k}(2i-1:1)$

is optimal (i.e. the interval $(\delta_{N-k}(2i-2), \delta_{N-k}(2i-1))$)
moves further and further leftwards as (N-k) increases.

(1) From facts 7.1-7.3.

Similarly, from (7.59) - (7.56) we see that the k time sequences
of even-indexed cost parameters

    $\{K_{N-k}(2i:1)\}$   $(i = 1, 2, \ldots, k)$

are each given by the same recursive difference equation:

•  compute $K_{N-k}(2i:1)$ via (7.55)-(7.56) from terminal condition

       $K_{N-i}(2i:1) = \dfrac{a^2(1)\, R(1)}{b^2(1)}$                             (7.83)

Only the terminal time is different for each i. At a fixed time
(N-k), these k cost parameters $\{K_{N-k}(2i:1),\ i = 1, \ldots, k\}$
correspond to all of the pieces of $V_{N-k}$ to the left of the
middlepiece from which the optimal controller will hedge-to-a-point at
some (single) future time (i.e. with one of the controls
$u_{N-k}, \ldots, u_{N-1}$). Here i = 1 corresponds to using $u_{N-1}$ to
drive $x_N$ to $-\alpha^+$; i = 2 corresponds to using $u_{N-2}$ to drive
$x_{N-1}$ to $-\alpha^+$; i = k corresponds to immediate
hedging-to-a-point (using $u_{N-k}$ to obtain $x_{N-k+1} = -\alpha^+$).

Thus from facts 7.1-7.3

    $\lim_{(N-k)\to\infty} K_{N-k}(2i:1) = K^{Le}_\infty(1)$

as well; however this convergence is monotone decreasing.


It is immediate from (7.65) and (7.66) that

    $\lim_{(N-k)\to\infty} H_{N-k}(2i:1) = 0$

    $\lim_{(N-k)\to\infty} G_{N-k}(2i:1) = 0$

for $i = 1, 2, \ldots$. So for all $i = 1, 2, \ldots$

    $\lim_{(N-k)\to\infty} V_{N-k}(i:1) = V^{Le}_\infty(1)$.                      (7.84)

That is, as (N-k) increases for this problem, the number of pieces in
$V_{N-k}(x_{N-k}, r_{N-k}=1)$ increases without bound and, for any fixed i,
the function $V_{N-k}(i:1)$ approaches the endpiece function
$V^{Le}_\infty(1)$ in shape. The odd-indexed pieces $V_{N-k}(i:1)$
$(i = 1, 3, 5, \ldots)$ approach this limit(1) from below (this follows
from (7.77)); the even-indexed pieces $V_{N-k}(i:1)$
$(i = 2, 4, 6, \ldots)$ approach it from above.

(1) At each x.

From parts 2, 5 and 6 of Proposition 7.7 and from (7.78) it follows
that the width of the switching regions $S^{1,L}_{N-k}$, $S^{1,R}_{N-k}$
(that is, the range of x values where active hedging at some future time
is optimal) increases as (N-k) increases. However (7.84) implies that
as (N-k) increases, active hedging in the far future(1) yields decreasing
savings in the optimal cost relative to the endpiece cost
function(2) $V^{Le}_\infty(1)$.

Condition (7.84) does not easily yield any further information
about the structure of the infinite time horizon solution, however.
In order to obtain further understanding of the infinite time horizon
problem it is useful to alter the indexing of the controller pieces.
Let us count from the middlepiece outwards instead of from left to
right. This indexing will use the notation "< >". Let

    $V_{N-k}\langle i \rangle$,  $K_{N-k}\langle i \rangle$,  $u_{N-k}\langle i \rangle$,  $G_{N-k}\langle i \rangle$,
    $H_{N-k}\langle i \rangle$,  $L_{N-k}\langle i \rangle$,  $F_{N-k}\langle i \rangle$

    for  $i = -2k, -2k+1, \ldots, -1, 0, 1, 2, \ldots, 2k$

denote the i-th piece to the left of zero if i < 0 (and to the right of
zero if i > 0). Here i = 0 denotes the middlepiece. Similarly let

    $\delta_{N-k}\langle i \rangle$,   $i = -2k, -2k+1, \ldots, -1, 1, 2, \ldots, 2k$

denote the i-th joining point of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left
or right of zero.
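The relation between the two indexings can be made explicit with a small conversion helper. The sketch below assumes that left-to-right index j and middle-out index i are related by j = 2k + 1 + i, which is consistent with the middlepiece being piece 2k+1 and the endpieces being pieces 1 and 4k+1. It also assumes that odd middle-out indices mark the pieces that hedge-to-a-point, because those correspond to the even-numbered left-to-right pieces. The function names are hypothetical.

```python
def to_left_to_right(i, k):
    """Left-to-right piece index (1..4k+1) for middle-out index i (-2k..2k)."""
    if not -2 * k <= i <= 2 * k:
        raise ValueError("middle-out index out of range")
    return 2 * k + 1 + i


def to_middle_out(j, k):
    """Middle-out index (-2k..2k) for left-to-right piece index j (1..4k+1)."""
    if not 1 <= j <= 4 * k + 1:
        raise ValueError("left-to-right index out of range")
    return j - (2 * k + 1)


def hedges_to_a_point(i):
    """True when middle-out piece i is a hedging-to-a-point piece."""
    return i % 2 != 0


k = 2
for i in range(-2 * k, 2 * k + 1):
    print(i, to_left_to_right(i, k), hedges_to_a_point(i))
```

With k = 2 the helper maps the left endpiece to piece 1, the middlepiece to piece 5 and the right endpiece to piece 9.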


(1) That is, for pieces far from zero.

(2) Which is also an upper bound for $V_{N-k}(x_{N-k}, r_{N-k}=1)$ for this
problem, by Fact 7.3.


Thus  the  middlepiece  at  time  (N-k)  is 


We already know the steady-state behavior of the middlepiece and
endpieces. Using the < > notation we can summarize and amplify
the discussion above by the following:

Proposition 7.8: For the problem of Proposition 7.7, as $(N-k) \to \infty$
we have:

1.  The number of pieces $m_{N-k}(1) = 4k+1$ of the controller
    becomes countably infinite.


2.  In form r=2 the expected cost-to-go parameter $K_{N-k}(1:2)$ converges
    monotonely as (N-k) increases to $K_\infty(1:2)$ given by (6.98), and
    $L_{N-k}(1:2)$ converges to

        $L_\infty(1:2) = \dfrac{b(2)}{a(2)\, R(2)}\, K_\infty(1:2)$


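3.  The middlepiece functions (of x) (for $k = 1, 2, \ldots, N$)

        $V_{N-k}\langle 0 \rangle = V_{N-k}(2k+1:1)$
        $u_{N-k}\langle 0 \rangle = u_{N-k}(2k+1:1)$

    converge monotonely as $(N-k) \to \infty$ to the steady-state functions

        $V_\infty\langle 0 \rangle = V^{LM}_\infty(1) = x^2 K_\infty\langle 0 \rangle$
        $u_\infty\langle 0 \rangle = u^{LM}_\infty(1) = -L_\infty\langle 0 \rangle\, x$

    where

        $K_\infty\langle 0 \rangle = K^{LM}_\infty(1) = \lim_{(N-k)\to\infty} K_{N-k}\langle 0 \rangle$

    is the unique positive solution of (6.107)(1):

        $K_\infty\langle 0 \rangle = \dfrac{-B + \sqrt{B^2 + 4 a^2(1) R(1) b^2(1) (1-\omega_1)\, [Q(1) + \omega_1 (K_\infty(1:2) + Q(2) - Q(1))]}}{2\, b^2(1)\, (1-\omega_1)}$

        with  $B = R(1)\,[1 - a^2(1)(1-\omega_1)] + b^2(1)\,[Q(1) + \omega_1 (K_\infty(1:2) + Q(2) - Q(1))]$

    and

        $L_\infty\langle 0 \rangle = L^{LM}_\infty(1) = \lim_{(N-k)\to\infty} L_{N-k}\langle 0 \rangle$

    is given by

        $L_\infty\langle 0 \rangle = \dfrac{b(1)}{a(1)\, R(1)}\, K_\infty\langle 0 \rangle$

    (1) From Propositions 6.2 and 6.4, and (6.106)-(6.107).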

4.  The  endpiece  functions  (of  x)  for  k»l,2,...,N 


W-23* 

=  VLe  (1) 
N-kVAi 

=  Vk(l!l) 

Vk<2k> 

=  V*e  (1) 
N-k'XJ 

=  VN.k(4k+l:l) 

Vk^21" 

L6  . 

*  Vx111 

-  Vk(lil) 

Vk" 210 

■  Vi(l> 

=  uN_k(4k+l:l) 

converge  monotonelyas  (N-k)-x»  to  the  steady-state 


5. 


The  functions 


Vk<t2i>  ■  VkVk<t2i> 
Vk  <t2i>  -  -Vk  <i2i>  Vk 


(for  i=l,2,...,k  and  k=l,2,...,N) 

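converge monotonely as $(N-k) \to \infty$ to the functions (of x)

$$V_\infty\langle \pm 2i \rangle = x^2 K_\infty\langle \pm 2i \rangle$$
$$u_\infty\langle \pm 2i \rangle = -L_\infty\langle \pm 2i \rangle\, x$$

for i = 1,2,3,... where

$$K_\infty\langle \pm 2i \rangle = \frac{a^2(1) R(1) \left[ (1-\omega_2)\left(K_\infty\langle -2(i-1) \rangle + Q(1)\right) + \omega_2 \left(Q(2) + K_\infty(1:2)\right) \right]}{R(1) + b^2(1) \left[ (1-\omega_2)\left(K_\infty\langle -2(i-1) \rangle + Q(1)\right) + \omega_2 \left(Q(2) + K_\infty(1:2)\right) \right]} \tag{7.87}$$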


and 


L  <-2i> 
00 


b(l) 

a(l)R(l) 


V-2i> 


=  -L <2i> 
00 


These pieces of the limiting controller correspond to never hedging-to-a-point.
For $V_\infty\langle \pm 2i \rangle$, the x process stays outside of the
advantageous p(1,2:x) piece $(-\alpha, \alpha)$ until i time steps in the future,
after which it stays inside $(-\alpha, \alpha)$ forever.


The  functions  (for  k=l,2,...,N) 


V  .  ,  <±1>  *  x  K  ,  <±1>  +  H  ,  <±1>  x  +  G  ,  <±15 


6. 


U  <±1>  =  -L  <±1>  X  +  F  <±1> 

00  00  CO 

at  all  k,  where 

Lflo<-l>  =  LN_k(2k:l)  =  a(l)/b(l)  =  -Loo<l>  (7.91) 

Foo<±l>  F^k(2k:l)  *  -a/b  (1)  (7.92) 

These pieces of the limiting controller correspond to hedging-to-a-point
immediately (using the very next control input).

7.  For k = 2,...,N and for each fixed $i \in \{1,\ldots,k-1\}$

Vk<t(2i+1)>  ■  Vk<(2U1»  +  Vk<ll2iHI>  Vk  +  Wi<2i+1)> 

and 

W±(2i+1)>  .  -Vit<t(2iU»  Vk  ♦  Vk<t(2i+ll> 




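converge monotonely as $(N-k) \to \infty$ to the functions (of x)

$$V_\infty\langle \pm(2i+1) \rangle = x^2 K_\infty\langle \pm(2i+1) \rangle + H_\infty\langle \pm(2i+1) \rangle\, x + G_\infty\langle \pm(2i+1) \rangle$$

$$u_\infty\langle \pm(2i+1) \rangle = -L_\infty\langle \pm(2i+1) \rangle\, x + F_\infty\langle \pm(2i+1) \rangle$$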


where 


K  <± (21+1) >  =  a  (l)R(l) 


(1-0J2)  /Kso<±(2i-l)+l)>  \  +  U2  /  Q  (2) 


v-K2(D 


^(1:2), 


R(l)  +b2(l)  I  (1-W2)/K-»<±(2i*1)+1)>V  U>,  /Q[2) 


QU) 


2  + 
Wl:2)/ 


(7.93) 


Loo<_(2;l+1)>  a(l)R(l) 


K  <-2i-l>  = 


-L  <2i+l> 

CO 


(7.94) 


-H  <2i+l>  =  H  <-(2i+l)>  = 


a(l)R(l)  (l-w2)  Hao<~2  (i-1) -1> 


R(l)+b2  (!)(!-'-*) 2  ) 


U  K00<-2(i-l)-l>  V  0)  /Q(2) 
A +  2(11  / 


K  (1 :2) 

00  i 


(7.95) 


Foo<±(2i+1)>  =  2a  (1)  R  ( 1 )  “« 


H  <- (2i+l) >  = 


-  F  <2i+l> 
00 


(7.96) 


Gm<±)  21+1)  >  *  (l-w2)  G00<±(2i-1)+1)> 


-  b2(l)  (l-w2)2(Hoo<±(2(i-l)>)2 


R(l)  +  b2(l) 

'(Koo<±(2(i-l)+l)>)  ( 1— 0J2  )  -Ki>2  ,Q(  1)  r 

♦  ) 

U  \ K  (1:2)  / 

30 

These pieces of the limiting controller correspond to hedging-to-a-point
using the i-th control after the one that is immediately applied.
That is, they correspond to hedging-to-a-point i time steps in the
future.


8.  As $(N-k) \to \infty$ the joining point $\delta_{N-k}(2k)$
converges monotonely to


o  <-l> 


-a 


=  lim  6  <-l>  ... 

N-k)-*00  N'k  a(1) 


/1+b2<2» 

fT  « 

(i-V  /vo>  \ 

*,u 

\+  S(l)/ 

v 

(Kot(1:2)-K2(2)) 

(7.98) 


and  v<+l>  $  <+l>  *  -<S  <-l> 

N-k  00  00 


9.  For k = 2,3,...,N and each fixed $i \in \{1,\ldots,k-1\}$, the joining points
$\delta_{N-k}\langle -(2i+1) \rangle$ converge monotonely as $(N-k) \to \infty$ to


6  <-(2i+l)> 
00 


6  <-(2(i-l)+l)> 
00 


a(l) 


2 

_ 

1  +  b  (1) 

(l-o)2)  /kbo<-  (2  (i-1)  +1)  >)\ 

1  +  R(l) 

\+  Qd)  / 

^(^(1:2)  +  Q  (2) ) 

(7.99) 


=  -<5  <-<2i+l»  . 

00 


and  6  <+(2i+l)>+  6  <2i+l)> 


10.  For k = 1,2,...,N and each fixed $i \in \{1,\ldots,k\}$,
$\delta_{N-k}\langle -2i \rangle$ converges to


rH  <2i+l> ) 


6  <-2i>  =  -H  <-2i+l>  - J  -4 [K  <-2i+l>  -  K  <-2i>]  G  <-2i+l> 

00  OO  v  00  00  QO 

2 [K  <-2i+l>  -  K  <— 2i> 3 

00  00 


(7 


and  5„  .  <+2i>  6  <2i>  =  -<5  <-2i> 

N-k  00  00 


11.  For the limiting problem solution parameters, as $(N-k) \to \infty$ the
following relationships hold:


(i) 

K  <0> 
oo 

< 

K  <2>  < 

00 

(ii) 

K  <-l> 

> 

K  <-3>  > 

00 

00 

(iii) 

6  <-2i- 
00 

■1> 

<  <S  <— 2  i>  < 

OO 

nee 


v4>  <  ....  iCf1) 

oo<-5>  >  ■  ....>Kie(l) 

<5  <»2i+l> 

00 


6  <2i+l  >  -  6  <2i-l  >  >  0 

00  00 

6  <-2i+l>  -  <5  <-2i-l>  >  0 

00  00 


.100) 


the 


(7.101) 

(7.102) 

(7.103) 


(7.104) 


The proof of this proposition follows directly from Proposition 7.7
and our previous discussion. Proposition 7.8 says that as the time
horizon becomes infinite, the number of pieces in the optimal controller
also becomes infinite, but that each piece (counting from the center out)
converges to a constant steady-state function that is optimal over a
constant steady-state interval of x values. From (7.104) we see that
the width of the switching regions will grow without bound as (N-k)
increases.
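The convergence of the non-hedging pieces can be explored numerically by iterating the recursion (7.87). The following is only a minimal sketch under stated assumptions: it takes (7.87) to read $K_\infty\langle 2i \rangle = a^2(1)R(1)M / (R(1) + b^2(1)M)$ with $M = (1-\omega_2)(K_\infty\langle 2(i-1) \rangle + Q(1)) + \omega_2(Q(2) + K_\infty(1:2))$, and the function name, argument names and parameter values are hypothetical, not taken from the text.

```python
def iterate_even_pieces(k0, k12, a, b, r, q1, q2, w2, n):
    """Iterate the assumed form of (7.87) n times, starting from K<0> = k0.

    k12 stands for K_inf(1:2); a, b, r, q1 stand for a(1), b(1), R(1), Q(1);
    q2 stands for Q(2); w2 is the failure probability used in (7.87).
    All names are illustrative assumptions.
    """
    ks = [k0]
    for _ in range(n):
        # Expected next-step quadratic weight: mix of "no failure" and "failure".
        m = (1.0 - w2) * (ks[-1] + q1) + w2 * (q2 + k12)
        ks.append(a * a * r * m / (r + b * b * m))
    return ks


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    values = iterate_even_pieces(k0=1.0, k12=5.0, a=1.2, b=1.0, r=1.0,
                                 q1=1.0, q2=2.0, w2=0.1, n=10)
    for i, k in enumerate(values):
        print(f"K<{2 * i}> = {k:.6f}")
```

Because the map is increasing in its argument and bounded above by $a^2(1)R(1)/b^2(1)$, the iterates form a bounded monotone sequence under this reading, which matches the qualitative statement of the proposition.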

From  Prop.  7.8  it  is  clear  that  we  cannot  implement  precisely 

a  steady-state  JLQ  controller  for  the  infinite  time  horizon  problem  using 

a  finite  algorithm,  because  there  are  infinitely  many  controller  pieces. 

However  the  convergence  of  cost  pieces  to  vLe(l)  in  (7.84)  suggests  a  na- 

00 

tural  approximation  of  the  steady-state  controller.  The  idea  is  to  use, 
at each time (N-k), the true optimal controller for a certain number of
pieces around zero, and approximate the rest of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ by the
endpiece functions. As we will discuss in section 7.7, this allows us to
approximate $V_{N-k}(x_{N-k}, r_{N-k}=1)$ arbitrarily well, and to relate the
"complexity" of the controller (in terms of the number of pieces to be
solved for and implemented) to the approximation error. This approximate
controller is essentially a finite look-ahead scheme; the option of active
hedging is considered only for a finite number of future times. In section 7.7
a finite look-ahead approximation of the optimal JLQ controller is developed for
the general class of problems of chapter 5.

In  this  section  we  have  examined  the  structure  of  the  optimal  JLQ 
controller  for  a  class  of  problems  having  commensurate  reliability  and 
performance  goals.  This  class  of  problems  has  a  solution  structure  that  is 
particularly  amenable  to  detailed  analysis.  Its  solution  illustrates  the 
way  the  optimal  controller  uses  active  hedging  to  achieve  fault  tolerance 
in  this  commensurate  goals  case.  In  the  next  section  we  will  consider 
the  solution  of  an  illustrative  class  of  problems  with  conflicting 
performance  and  reliability  goals. 


7.6  Active Hedging When Reliability and Performance Goals are
Conflicting


In this section we consider an example class of systems that illustrates
how the optimal JLQ controller uses active hedging to achieve fault-
tolerance when the system performance and reliability goals are conflicting.

We will examine here the "case 2" single form-transition problem of
section 7.4, as (N-k) increases. Under certain additional parametric
conditions (that are derived here) the optimal JLQ controller at all times
$k = N, N-1, \ldots, k_0$ can be specified by recursive difference equations.

We will develop for this conflicting goals example results analogous
to those of the previous section for a commensurate goals example. We
are considering the scalar, single form-transition problem (7.2) - (7.5)
where Facts 7.1, 7.2 and 7.3(2) hold. That is, we have

$$\omega_1 > \omega_2 \tag{7.105}$$

and figs. 7.9(b,c)¹ and 7.13(b) apply. We will also assume that (7.14)
holds:

Thus $\pm\alpha$ are the grid points of the $x_{N-1}$ composite partition that are
closest to zero. Note that when (7.10) holds (i.e., there are local
minima in $V_{N-1}(x_{N-1}, r_{N-1}=1)$)², then (7.14) holds for all a(1) < 1.

¹ depending on whether (7.10) holds or not
² see comments following (7.10)


J  R(l)  +  b2(l)  i^U)  \ 

R(l)  +  b2(l)  Kjj (2)  ' 


In figure 7.23 we collect for convenience the curves of $V_N(x_N, r_N=1)$,
$\hat V_N(x_N \mid r_{N-1}=1)$, $V_{N-1}(x_{N-1}, r_{N-1}=1)$ and
$\hat V_{N-1}(x_{N-1} \mid r_{N-2}=1)$ for this problem.

For this problem we have² that the endpieces of $V_k(x_k, r_k=1)$ coincide
with the lower bound function $V_k^{LB}(1)$ and the middlepiece coincides with
the upper bound (the opposite of section 7.5). The performance and
reliability goals of the optimal controller conflict because driving x
near zero (to the performance goal) necessitates driving x inside the region
$(-\alpha, \alpha)$ where the probability of failure is high. Thus near zero,
$V_k(x_k, r_k=1)$ reaches its upper bound cost. Far from zero, $V_k(x_k, r_k=1)$
approaches its lower bound because the risk of failure³ (which, by
Fact 7.1, is costly) is kept small.

Let us consider now the candidate cost-to-go functions that are
eligible for $V_{N-2}(x_{N-2}, r_{N-2}=1)$. In Figure 7.24 we show the candidate
functions (of $x_{N-2}$) $V_{N-2}^{i,U}$ for i = 1,2,...,7 for this problem.⁴ These
functions of $x_{N-2}$ coincide with solutions of


UN-2  {  VN-2(XN-2'rN-2  L)} 


XN-1  6  AN-l(l)  =  (YN-l(i"1),Y  N-l(l)) 
for  those  x„  „  values  where  the  resulting  x„  ,  is  in  the  interior 


¹ As derived in Chapter 6 and sections 7.1 - 7.4.
² By Facts 7.1 - 7.3.
³ That is, entering form r=2.
⁴ Fig. 7.24 corresponds with fig. 7.15 in the previous section.


The following relationships that are pictured in Figure 7.24 are verified
in Appendix C.15:


$$V_{N-2}^{1,U} = V_{N-2}^{7,U} < V_{N-2}^{3,U} = V_{N-2}^{5,U} < V_{N-2}^{4,U} \quad \text{at all } x_{N-2} \tag{7.106}$$


v2'?  >  v1'? 

N-2  N-2 


V6'U  >  v7'u 

N-2  N-2 


for  all  x„  .  except 

N“  Z 

(2)  / 

equality  at  xN_2  =  0N_2/a(1) 

for  all  ^N_2  except 

equality  at  /aU)  * 


(7.107) 


(7.108) 


In addition to the seven candidate cost functions that are shown in
Figure 7.24, there are two other candidate functions that are eligible
for $V_{N-2}(x_{N-2}, r_{N-2}=1)$, according to Proposition 5.2.


The functions are

$V_{N-2}^{3,R}(x_{N-2}, r_{N-2}=1)$: driving $x_{N-1}$ to $-\alpha^-$
(right boundary of $\Delta_{N-1}(3)$)

$V_{N-2}^{5,L}(x_{N-2}, r_{N-2}=1)$: driving $x_{N-1}$ to $\alpha^+$
(left boundary of $\Delta_{N-1}(5)$).

These costs result from driving $x_{N-1}$ to the lower cost side of a
$\hat V_{N-1}(x_{N-1} \mid r_{N-2}=1)$ discontinuity. The parameters for
$V_{N-2}^{3,R}$ and $V_{N-2}^{5,L}$ are listed in Appendix C.15.

We  note  that,  by  definition: 


3,R  3,U 
N-2  N-2 


except  equality  at 


V2  -  9H-2(3»/a<1> 


»/• 


N-2  N-2 


except  equality 


at  =  0„ 


!'/ 


a  (1) 


(7.109) 

(7.110) 


If we superimpose the curves of $V_{N-2}^{3,R}$ and $V_{N-2}^{5,L}$ on Figure 7.24 and compare
the regions of $x_{N-2}$ validity of each curve, we can obtain (as in Section
7.2) the optimal expected cost-to-go, $V_{N-2}(x_{N-2}, r_{N-2}=1)$, at each value
of $x_{N-2}$. However $V_{N-2}^{3,R}$ and $V_{N-2}^{5,L}$ can be in three different graphical
positions relative to the $V_{N-2}^{i,U}$ (i = 1,...,7) of Figure 7.24, depending
upon the values of problem parameters. We will examine each of these
three possibilities in turn.

We first refer to Figure 7.24. The functions $V_{N-2}^{1,U}$ and $V_{N-2}^{7,U}$ are
(by Fact 7.3) lower bounds on $V_{N-2}(x_{N-2}, r_{N-2}=1)$. Therefore they will
be optimal over their entire domains of validity. That is,

$$V_{N-2}(x_{N-2}, r_{N-2}=1) = V_{N-2}^{1,U} \quad \text{for } x_{N-2} < \theta_{N-2}(1)/a(1) \tag{7.111}$$

$$V_{N-2}(x_{N-2}, r_{N-2}=1) = V_{N-2}^{7,U} \quad \text{for } x_{N-2} > \theta_{N-2}(7)/a(1). \tag{7.112}$$


To the right of $x_{N-2} = \theta_{N-2}(1)/a(1)$, the candidate cost $V_{N-2}^{2,U}$ will be
optimal until, at some $x_{N-2} > \theta_{N-2}(1)/a(1)$, $V_{N-2}^{2,U}$ intersects another
valid eligible candidate cost. Similarly, to the left of
$x_{N-2} = \theta_{N-2}(7)/a(1)$, $V_{N-2}^{6,U}$ will
be optimal until, at some $x_{N-2} < \theta_{N-2}(7)/a(1)$, $V_{N-2}^{6,U}$ intersects another
valid eligible candidate cost. We know from Proposition 5.3 that the
optimal controller mapping

$$x_{N-1}(x_{N-2}, r_{N-2}=1)$$

must be monotone. We also know (from Proposition 5.1) that
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ is continuous in $x_{N-2}$. Thus we are left with the
following question:


How does $V_{N-2}(x_{N-2}, r_{N-2}=1)$ get from the $V_{N-2}^{2,U}$ curve to $V_{N-2}^{4,U}$ and
then from the $V_{N-2}^{4,U}$ curve to $V_{N-2}^{6,U}$ while satisfying the monotonicity
and continuity requirements mentioned above? The three ways to superimpose
$V_{N-2}^{3,R}$ and $V_{N-2}^{5,L}$ on Figure 7.24 each correspond to a different
answer to this question. They are illustrated in Figures 7.25 - 7.27.
We will examine each possibility in turn.

The first possibility (shown in Figure 7.25) results in
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ having (the maximum allowable number) $m_{N-2}(1) = 9$
pieces. Each piece corresponds to a different active hedging strategy
using $u_{N-2}$ and $u_{N-1}$. This case is analogous to Figure 7.16 for the
commensurate goals problem of the previous section. The switching
regions ($x_{N-2} \in S_{N-2}^{L}$ and $x_{N-2} \in S_{N-2}^{R}$) are divided into three parts:

•  for $x_{N-2} \in (\delta_{N-2}(3), \delta_{N-2}(4))$ and $(\delta_{N-2}(5), \delta_{N-2}(6))$
the optimal controller uses $u_{N-2}$ to immediately hedge-to-a-point
(to $x_{N-1} = -\alpha^-$ or $x_{N-1} = \alpha^+$). This keeps the probability of
failure at time N-1 (i.e., the probability that $r_{N-1} = 2$) low.
Then $u_{N-1}$ is used to drive $x_N$ into the high risk piece $(-\alpha, \alpha)$
of p(1,2:x). This increased risk at time N is compensated for
by making $x_N$ near zero, so that the performance cost

4  [8< V  +  W1 

is small for both $r_N = 1$ and $r_N = 2$.

•  for $x_{N-2} \in (\delta_{N-2}(2), \delta_{N-2}(3))$ and $(\delta_{N-2}(6), \delta_{N-2}(7))$ the
optimal controller keeps $|x_{N-1}| > \alpha$ without hedging-to-a-point
with $u_{N-2}$. Then $u_{N-1}$ is used to make $|x_N| < \alpha$ without
hedging-to-a-point.

Figure 7.25: $V_{N-2}(x_{N-2}, r_{N-2}=1)$ in the first situation. The optimal cost
is indicated by the heavier line. The resulting optimal
values of $x_N$ and $x_{N-1}$ for each of the 9 pieces of
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ are also labelled.


                                      Situation 1      Situation 2      Situation 3
                                      (Fig. 7.25)      (Fig. 7.26)      (Fig. 7.27)

Number of pieces of                   9 (maximum       7                5
V_{N-2}(x_{N-2}, r_{N-2}=1)           possible)

Endpiece functions                    V_{N-2}(1:1)     V_{N-2}(1:1)     V_{N-2}(1:1)
                                      V_{N-2}(9:1)     V_{N-2}(7:1)     V_{N-2}(5:1)

Middlepiece function                  V_{N-2}(5:1)     V_{N-2}(4:1)     V_{N-2}(3:1)

Other pieces that do not involve      V_{N-2}(3:1)     none             none
hedging to -alpha^-, alpha^+          V_{N-2}(7:1)

Pieces involving hedging-to-a-point   V_{N-2}(4:1)     V_{N-2}(3:1)     none
with u_{N-2}                          V_{N-2}(6:1)     V_{N-2}(5:1)

Pieces involving hedging-to-a-point   V_{N-2}(2:1)     V_{N-2}(2:1)     V_{N-2}(2:1)
with u_{N-1} (but not with u_{N-2})   V_{N-2}(8:1)     V_{N-2}(6:1)     V_{N-2}(4:1)

TABLE 7.4: Comparison of Three Possible Situations for $V_{N-2}(x_{N-2}, r_{N-2}=1)$


•  for $x_{N-2} \in (\delta_{N-2}(1), \delta_{N-2}(2))$ and $(\delta_{N-2}(7), \delta_{N-2}(8))$ the
optimal controller keeps $|x_{N-1}| > \alpha$ without hedging-to-a-point
with $u_{N-2}$. Then $u_{N-1}$ is used to hedge-to-a-point (to
$x_N = -\alpha^-$ or $x_N = \alpha^+$). That is, for these $x_{N-2}$ values the
optimal controller does not let $x_N$ enter the high risk piece
$(-\alpha, \alpha)$ of p(1,2:x). It is better at time N to stay on the
low failure probability sides of the p(1,2:x) discontinuities.
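Hedging-to-a-point has a simple closed form in the scalar case. As a minimal sketch, assuming the form-1 dynamics are $x_{k+1} = a(1)x_k + b(1)u_k$ (our reading of the scalar problem (7.2) - (7.5), which is not restated in this section), the control that places the next state at a chosen target is affine in the current state. The function name and the numerical values are hypothetical.

```python
def hedge_control(x, target, a, b):
    """Control u that moves x to `target` in one step of x_next = a*x + b*u.

    Assumes scalar dynamics with b != 0.  The result is affine in x:
    u = -(a/b)*x + target/b, i.e. a gain a/b and an offset target/b.
    """
    if b == 0:
        raise ValueError("b must be nonzero to steer the state")
    return (target - a * x) / b


if __name__ == "__main__":
    a, b, alpha = 1.5, 0.5, 1.0      # hypothetical a(1), b(1) and threshold
    x = 3.0
    u = hedge_control(x, -alpha, a, b)
    print(u, a * x + b * u)          # control used and the resulting next state
```

With target $-\alpha$ the gain is $a(1)/b(1)$ and the offset is $-\alpha/b(1)$, which agrees with the values quoted in (7.91) - (7.92). The one-sided targets $-\alpha^-$ and $\alpha^+$ of the text are limits; in a numerical experiment one would presumably aim at $-\alpha - \epsilon$ or $\alpha + \epsilon$ for a small $\epsilon > 0$.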

The second possibility (shown in Figure 7.26) results in
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ having $m_{N-2}(1) = 7$ pieces. Unlike the first situation
of Figure 7.25, there are no $x_{N-2}$ values from which the optimal
controller causes $x_{N-1}$ and $x_N$ to be in different p(1,2:x) pieces without
using hedging-to-a-point. Here the switching regions ($S_{N-2}^{L}$ and $S_{N-2}^{R}$)
of $x_{N-2}$ values are each divided into two parts:

•  for $x_{N-2} \in (\delta_{N-2}(2), \delta_{N-2}(3))$ and $(\delta_{N-2}(4), \delta_{N-2}(5))$
the optimal controller uses $u_{N-2}$ to keep $x_{N-1}$ on the low
probability side of the p(1,2:x) discontinuity. That is,
$u_{N-2}$ is used to hedge-to-a-point $x_{N-1} = -\alpha^-$ or $x_{N-1} = \alpha^+$.
Then $u_{N-1}$ keeps $x_N$ inside $(-\alpha, \alpha)$, making $x_N^2$ small.

•  for $x_{N-2} \in (\delta_{N-2}(1), \delta_{N-2}(2))$ and $(\delta_{N-2}(5), \delta_{N-2}(6))$ the
optimal controller keeps $|x_{N-1}| > \alpha$ without hedging-to-a-point
with $u_{N-2}$. Then $u_{N-1}$ is used to keep $x_N$ outside the
high-risk piece of p(1,2:x) by hedging to the points
$x_N = -\alpha^-$ or $x_N = \alpha^+$.

N  N 


In the third possible situation, as shown in Figure 7.27, the
optimal controller has

$$m_{N-2}(1) = 5 = m_{N-1}(1)$$

pieces. In this situation either $x_{N-1}$ and $x_N$ are in the same p(1,2:x)
piece (that is, $x_{N-2}$ is in the domain of an endpiece or middlepiece of
$V_{N-2}(x_{N-2}, r_{N-2}=1)$) or else the optimal controller hedges-to-a-point
at time N, using $u_{N-1}$ to obtain $x_N = -\alpha^-$ or $x_N = \alpha^+$. In this situation
there is no immediate hedging-to-a-point (using $u_{N-2}$). This is in
contrast to the third situation of the commensurate goal problem (Figure
7.18) where the only hedging-to-a-point is immediate.

In table 7.4 various aspects of these three situations for
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ are summarized. We want to relate these various
active hedging strategies to the values of the problem parameters.

From Figures 7.25 - 7.27 we can obtain the following graphical
conditions relating to the three possible shapes of $V_{N-2}(x_{N-2}, r_{N-2}=1)$
in this problem.

Fact 7.9: For the problem (7.2) - (7.5) with Facts 7.1, 7.2, 7.3(2)
and (7.14) holding, we have the following:


(1)  Situation (1), as in Figure 7.25, occurs if and only if the
rightmost intersection of the two quadratic functions $V_{N-2}^{2,U}$
and $V_{N-2}^{3,U}$ is to the left of $x_{N-2} = \theta_{N-2}(3)/a(1)$.

(2)  Situation (3), as in Figure 7.27, occurs if and only if the
rightmost intersection of $V_{N-2}^{2,U}(x_{N-2})$ and $V_{N-2}^{4,U}(x_{N-2})$ occurs to
the right of (or exactly at) the rightmost intersection of
$V_{N-2}^{3,R}(x_{N-2})$ and $V_{N-2}^{4,U}(x_{N-2})$.

(3)  Situation (2), as in Figure 7.26, occurs when (1) and (2)
above are both not met. A necessary condition is that

VH-2<V2)  and  VN-2(V2>  intarse<:t- 


Proof:  Immediate  from  Figures  7.25  -  7.27. 
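The graphical tests of Fact 7.9 compare the rightmost intersection of two quadratic candidate costs with a threshold or with another intersection. As a hedged illustration (not the derivation of Appendix C.15), the sketch below computes that intersection for two generic quadratics $Kx^2 + Hx + G$; the function name and the (K, H, G) tuple convention are assumptions made here for the example.

```python
import math


def rightmost_intersection(q1, q2):
    """Largest x at which two quadratics K*x**2 + H*x + G take the same value.

    q1 and q2 are (K, H, G) tuples.  Returns None if the curves never meet,
    or if they coincide everywhere (no single point is then defined).
    """
    dk, dh, dg = q1[0] - q2[0], q1[1] - q2[1], q1[2] - q2[2]
    if dk == 0:
        # The difference of the two curves is linear (or constant).
        return None if dh == 0 else -dg / dh
    disc = dh * dh - 4.0 * dk * dg
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return max((-dh + root) / (2.0 * dk), (-dh - root) / (2.0 * dk))


if __name__ == "__main__":
    # x**2 meets 2*x + 3 at x = -1 and x = 3.
    print(rightmost_intersection((1.0, 0.0, 0.0), (0.0, 2.0, 3.0)))
```

A test in the spirit of condition (1) would then compare the returned value with $\theta_{N-2}(3)/a(1)$, with the candidate-cost parameters taken from Appendix C.15.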


In  appendix  C.15  the  values  of  the  rightmost  intersections 

*  ,„2,U  .  „3,U,  ,„2,U  .  „4,U.  ,„3  ,R  .  „4,U,  ./„3, R 

of  (V„  „  and  V„' J ,  (V„/0  and  VN_2>  ,  (V^2  and  VN'2)  and\V^2 


N-2 


N-2 


N-2 


N-2 


N-2' 


and 


and  the  value  of  0.,  _  (3)  /a(l)  are  listed.  From  Fact  7.9  and 
N-2  N-2  ' 

these  values  we  obtain: 


Fact  7.10:  For  the  problems  of  Fact  7.9, 


(1)  Situation (1), as in Figure 7.25, occurs if and only if


a(l)  <  (l  +  -Jl-  X3  )  (7.113) 

where  X^  is  given  by  (C.15. 13). 


(2)  situation  (3),  as  in  Figure  7.27,  occurs  if  and  only  if 


2 

a(1)  >  (1  +  V2>>  (1  ^2)  X6  (1 


-Vi  -  x  ) 


(l 


(3) 


(7.14) 


R  +  b 


^-1 


(4) 


where  X  is  given  by  (C.15. 27)  and  X  by  (C.15. 28). 
5  6 


(3)  Situation  (2),  as  in  Figure  7.26  occurs  if  and  only  if 

(7.113)  and  (7.114)  are  both  not  true.  A  necessary  condition 
is  that  1  where  is  given  by  (C.15. 20). 


Proof: These conditions are derived in Appendix C.15. □

Conditions (7.113) - (7.114) are implicit relationships that can
be tested for any given set of problem parameters. We can use Facts 7.1,
7.2 and 7.3(1) to obtain sufficient conditions that are more easily
analyzed, similar to those obtained in Fact 7.6 for the three cases of
the commensurate goal problem. In particular we can use (7.113),
(C.15.13) and condition (7.14) to obtain the following sufficient
condition:


Fact 7.11: For the problem of Fact 7.9, situation (1) occurs if


*(1>  <  i1  *  tuT"  V21) 


- 


(R(l)  +  b  (1)  K^d)]. 


1\ 


( R  ( 1 )  +b  (1)  [(1-C02)Q(1)-HU2(Kn_1(1:2)+Q(2)] 


J  rtR(l)+b2  (1)^(2)]  ' 

VR(l)+b2(l)  [(1-C02)Q(1)-K02C(2)+Kn_1(1:2)]]> 

(7.115) 

Another  weaker  sufficient  condition  for  situation  (1)  is  that 


a(l)  <  (1  +  Kjj (2) )  ^1  - 


1  - 


R(l)  +  b  (1)  K^d) 


R(l)  +  b2  K^d) 


) 


R(l) 

where  (1)  is  given  by 

K^d)  =  d  -  0^)  (Qd)  +  «£”(!))  +  CO^Qd)  +  (1:2) ) 


(7.116) 


(7.117) 


for  K  (1:2)  =  lim  K.  (1:2)  as  given  in  (6.98) 
(N-k)-*» 


and  K^d)  =  lim  K?e(l) 
(N-k)-*00  * 


as  given  in  (6.107)  . 


Proof: See Appendix C.15. □


As in the commensurate goals case (Facts 7.5, 7.6), for the
conflicting goals case under study here the structure of $V_{N-2}(x_{N-2}, r_{N-2}=1)$
can be related to the value of a(1) (that is, to the stability of the
open loop dynamics in form r=1). For sufficiently small a(1), where
(7.113) is satisfied, we have the optimal controller of Figure 7.25. For
larger a(1) (satisfying neither (7.113) nor (7.114)) the optimal
controller must hedge-to-a-point with $u_{N-2}$ or $u_{N-1}$ for all $x_{N-2}$ that
are not in the endpieces or middlepiece domains. This is the case
shown in Figure 7.26. For sufficiently unstable a(1) (satisfying (7.114))
the optimal controller does not hedge-to-a-point until the last possible
time. This is the situation shown in Figure 7.27. For
$x_{N-2} \in (\delta_{N-2}(1), \delta_{N-2}(2))$ and $x_{N-2} \in (\delta_{N-2}(3), \delta_{N-2}(4))$ the optimal
controller hedges to either $x_N = \alpha^+$ or $x_N = -\alpha^-$ with control $u_{N-1}$.

Comparing the large a(1) case for the conflicting goals problem
of this section and the commensurate goals problem of section 7.5
(that is, comparing Figures 7.18 and 7.27), we observe a basic difference
in the nature of active hedging in these two cases. In both problems,
for large a(1) the optimal controller either

•  always  puts  x  inside  (-a, a) 

(the  middlepiece) 

or 

•  always  puts  x  outside (-a, a) 

(the  endpieces) 

or  it 

•  hedges  to  the  advantageous  side  of  a  p(l,2:x)  discontinuity 
at  one  (and  only  one)  time. 


For the commensurate goal problem, this hedging-to-a-point is immediate,
using $u_{N-2}$ to drive $x_{N-1}$ to $\alpha^-$ or $-\alpha^+$. For the conflicting goals problem
the optimal controller doesn't hedge-to-a-point until the last possible
time, using $u_{N-1}$ to drive $x_N$ to $\alpha^+$ or $-\alpha^-$.

It  is  easy  to  see  that  hedging-to-a-point  must  be  done  quickly  in  the 
commensurate  goals  case,  since  if  a(l)  is  large  it  tends  to  drive  the 
system  away  from  the  desirable  part  of  the  x  axis  (with  respect  to  both 
the  performance  and  reliability  goals) .  Spending  extra  control  energy 
to  drive  x  inside  (-a, a)  leads  to  savings  in  the  performance  cost  at 
all  future  times,  and  this  strategy  also  reduces  the  likelihood  of  the 
undesirable  form  transition  (from  form  1  to  form  2) . 

In the conflicting goals case, if a(1) is large it causes the system
to move away from the performance goal (i.e., the origin), but it also
tends to drive the system into the advantageous p(1,2:x) piece (or
keeps it there). Hedging-to-a-point decreases the probability of a bad
transition but it results in a much larger operating cost at all future
times (since a(1) is large). Thus it is not desirable to hedge-to-a-
point until there "isn't much future left", that is, at the terminal
time.

In the remainder of this section we will examine the optimal JLQ
controller for problems satisfying the sufficient condition (7.116)
of Fact 7.11. For these problems we find that at all times (N-k),
the optimal JLQ controller follows the pattern of figure 7.25. This
allows us to obtain a recursive description of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and
$u_{N-k}(x_{N-k}, r_{N-k}=1)$ at each time (N-k) for this class of conflicting goals
problems.


A typical $V_{N-k}(x_{N-k}, r_{N-k}=1)$ curve is shown in figure 7.28. It has
$m_{N-k}(1) = 4k+1$ pieces, and is symmetric about $x_{N-k} = 0$.

For $x_{N-k}$ in the middlepiece (i.e., for $\delta_{N-k}(2k) < x_{N-k} < \delta_{N-k}(2k+1)$)
the optimal controller will result in $|x_{N-k+\ell}| < \alpha$ for $\ell = 1,2,\ldots,k$.
That is, the x process will be kept in the high-failure-risk (but low
cost) piece of p(1,2:x) at all future times.

The first piece to the left of the middlepiece (i.e.,
$\delta_{N-k}(2k-1) < x_{N-k} < \delta_{N-k}(2k)$) corresponds to using $u_{N-k}$ to achieve
$x_{N-k+1} = -\alpha^-$. That is, we use the control $u_{N-k}$ to keep $x_{N-k+1}$ out of
the high-risk piece; we hedge to the low risk side of $-\alpha$. Then at the
next time (for k ≥ 2), the control $u_{N-k+1}$ will drive $x_{N-k+2}$ into the
high-risk, low-cost middlepiece. The x process will then be kept
inside $(-\alpha, \alpha)$ through time N.

The second pieces to the left and right of the middlepiece correspond
to having $|x_{N-k+1}| > \alpha$ but $|x_{N-k+\ell}| < \alpha$ for $\ell = 2,3,\ldots,k$ without
hedging-to-a-point.

The third pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left and right of the
middlepiece correspond to hedging-to-a-point (to $-\alpha^-$ or $\alpha^+$) one time
step in the future; that is, $u_{N-k+1}$ is used to obtain $|x_{N-k+2}| = \alpha$.
Then the x process is kept inside $(-\alpha, \alpha)$ from time (N-k)+3 through N.

In general the 2m-th pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left and
right of the middlepiece correspond to never hedging-to-a-point. The
x process will be in a low failure probability piece of p(1,2:x) for
$x_{N-k+1}, \ldots, x_{N-k+m}$ and then will be inside the high failure probability
region $(-\alpha, \alpha)$ from time (N-k)+m+1 through time N. Note that the
2k-th pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left and right of $x_{N-k} = 0$ are
the endpieces of the optimal cost. For $x_{N-k} < \delta_{N-k}(1)$ and
$x_{N-k} > \delta_{N-k}(4k)$, the optimal controller does not drive the x process
inside the high risk piece $(-\alpha, \alpha)$ of p(1,2:x) at any future time.
The (2m+1)st pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ to the left and right of
$x_{N-k} = 0$ correspond to using $u_{N-k+m}$ to hedge to $x_{N-k+m+1} = -\alpha^-$ or
$x_{N-k+m+1} = \alpha^+$. Then $|x_{N-k+\ell}| < \alpha$ for $\ell = m+2, \ldots, k$.
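The piece-by-piece description above amounts to a simple rule on the distance of a piece from the middlepiece: an even distance 2m means no hedging-to-a-point (m future states outside $(-\alpha, \alpha)$, the rest inside), and an odd distance 2m+1 means hedging once, with $u_{N-k+m}$. The following sketch encodes that reading; the function name and the returned tuple are illustrative choices, not notation from the text.

```python
def piece_strategy(k, j):
    """Describe piece j (1..4k+1) of the cost-to-go with k steps remaining.

    Returns (hedges, m, side):
      hedges -- True if the piece involves hedging-to-a-point (once),
      m      -- number of future states kept outside (-alpha, alpha) first;
                when hedges is True the hedging control is u_{N-k+m},
      side   -- 'left', 'right' or 'middle' relative to the middlepiece.
    """
    if k < 1 or not 1 <= j <= 4 * k + 1:
        raise ValueError("need k >= 1 and 1 <= j <= 4k+1")
    middle = 2 * k + 1
    n = abs(j - middle)              # number of pieces away from the middle
    side = "middle" if n == 0 else ("left" if j < middle else "right")
    hedges = (n % 2 == 1)            # odd distance: hedge exactly once
    m = n // 2
    return hedges, m, side


if __name__ == "__main__":
    k = 2
    for j in range(1, 4 * k + 2):
        print(j, piece_strategy(k, j))
```

For k = 2 this reproduces the nine-piece pattern of Figure 7.25: pieces 4 and 6 hedge immediately (m = 0), pieces 2 and 8 hedge at the last time (m = 1), and pieces 1 and 9 are the endpieces (m = k, never inside).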

As  in  the  commensurate  goals  problem  of  Section  7.5,  this  class  of 
conflicting  goal  problems  illustrates  that  using  control  to  alter 
failure  probabilities  at  future  times  is  directly  reflected  in  the 
expected  cost-to-go.  These  two  example  classes  motivate  a  finite-time- 
horizon  approximation  to  the  infinite  time  horizon  problem  (for  the 
general  problem  of  Chapter  5)  that  is  developed  in  the  next  section. 

The  following  Proposition  states  the  general  result  for  the  problem 
of  this  section.  We  again  use  the  shorthand  notation  of  (7.39). 


Proposition  7.12 

For  the  problem  of  (7.2)  -  (7.5)  with  facts  7.1,  7.2,  7.3(2)  and 
equation  (7.116)  of  fact  7.11  holding,  the  optimal  JLQ  controller  can 
be  completely  described  as  follows: 

1)  The number of pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$ at
time (N-k) is

$$m_{N-k}(1) = 4k + 1. \tag{7.118}$$

2)  This number of pieces is the maximum possible. That is, all eligible
candidate cost-to-go functions (by Proposition 5.2) are optimal over
some portion of their regions of validity. These eligible
candidate costs are

$V_{N-k}^{t,U}$, t = 1,...,4k-1   (driving $x_{N-k+1}$ into $\Delta_{N-k+1}(t)$)

$V_{N-k}^{2k-1,R}$   (hedging to $x_{N-k+1} = -\alpha^-$)

$V_{N-k}^{2k+1,L}$   (hedging to $x_{N-k+1} = \alpha^+$)


3)  $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$ are symmetric about
$x_{N-k} = 0$. That is,


Vk(i!l) 


=  VN.k(4k  ♦  2  -  i :  1), 


(7.119) 


Vk(i!l) 


=  u(4k  +  2  -  i : 1) 


I  Vk  58  ’x 


|XN-k  = 


(7.120) 


5N-k(1)  "  -Vk(4k  +  1  "  15 


(7.121) 


for  i  *  1,2,..., (2k  +  1) . 


4)  The closest grid points to zero in the composite $x_{N-k+1}$ partition
are $\pm\alpha$. That is,

$$\gamma_{N-k+1}(i) = \delta_{N-k+1}(i), \quad i = 1,\ldots,2k-2$$
$$\gamma_{N-k+1}(2k-1) = -\alpha$$
$$\gamma_{N-k+1}(2k) = \alpha \tag{7.122}$$
$$\gamma_{N-k+1}(j) = \delta_{N-k+1}(j-2), \quad j = 2k+1,\ldots,4k-2$$

5)  The odd-numbered pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$
are given by

$$V_{N-k}(i:1) = x_{N-k}^2\, K_{N-k}(i:1) \tag{7.123}$$

$$u_{N-k}(i:1) = -L_{N-k}(i:1)\, x_{N-k} \tag{7.124}$$

optimal for

$$\delta_{N-k}(i-1) < x_{N-k} < \delta_{N-k}(i), \quad i = 1,3,5,\ldots,(4k+1).$$

These costs are optimal for $x_{N-k}$ values where the best strategy is
not to use any of the values $u_{N-k}, \ldots, u_{N-1}$ to hedge-to-a-point.
They include the left endpiece $V_{N-k}(1:1)$, the middlepiece
$V_{N-k}(2k+1:1)$, and the right endpiece $V_{N-k}(4k+1:1)$.

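6)  The even-numbered pieces of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ and $u_{N-k}(x_{N-k}, r_{N-k}=1)$
are given by

$$V_{N-k}(i:1) = x_{N-k}^2\, K_{N-k}(i:1) + x_{N-k}\, H_{N-k}(i:1) + G_{N-k}(i:1) \tag{7.125}$$

$$u_{N-k}(i:1) = -L_{N-k}(i:1)\, x_{N-k} + F_{N-k}(i:1) \tag{7.126}$$

optimal for

$$\delta_{N-k}(i-1) \le x_{N-k} \le \delta_{N-k}(i), \quad i = 2,4,6,\ldots,4k.$$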

These costs correspond to actively hedging-to-a-point at one
(and only one) future time. Specifically, at each time (N-k) < N,
and for each i = 1,2,...,k, $V_{N-k}(2i:1)$ is the cost associated
with using control $u_{N-i}$ to hedge-to-a-point. That is, it is the
expected cost-to-go which results if

•  $u_{N-k+\ell-1}$ keeps $x_{N-k+\ell} < -\alpha$
for each $1 \le \ell \le k-i$

and then

•  $u_{N-i}$ hedges to $x_{N-i+1} = -\alpha^-$

and then

•  $u_{N-k+\ell-1}$ keeps $-\alpha < x_{N-k+\ell} < \alpha$
for each $k-i+2 \le \ell \le k$ (when k ≥ 2).

So $V_{N-k}(2k:1)$ corresponds to hedging-to-a-point immediately
(using $u_{N-k}$) and $V_{N-k}(2:1)$ corresponds to hedging-to-a-point
at the last time (using $u_{N-1}$).

Similarly for $V_{N-k}(j:1)$, where j = 2k+2,...,4k, the
optimal controller uses


-a 

(when  k  ^  2) 

=  -a 


to  hedge  to 


) 


7)  The cost-parameters in (7.123) - (7.126) are given recursively
by the following set of coupled difference equations:

•  In form r=2, for k = N-1, N-2,..., the parameters $K_k(1:2)$,
$L_k(1:2)$ are specified by (7.49) - (7.51) (as in the analogous
commensurate goal problem of Proposition 7.7).

•  For k = 1,...,N the middlepiece parameters $K_{N-k}(2k+1:1)$,
$L_{N-k}(2k+1:1)$ are given by (7.52) - (7.54).
Here $K_{N-k}(2k+1:1) = K_{N-k}^{UB}(1)$, the upperbound parameter.

•  For k = 1,2,...,N and i = 1,2,...,2k-1,
$K_{N-k}(i:1)$ and $L_{N-k}(i:1)$ are given by (7.55) - (7.58).

•  For k = 1,2,...,N the immediate-hedging-to-a-point parameters
$K_{N-k}(2k:1)$, $H_{N-k}(2k:1)$, $L_{N-k}(2k:1)$ and $F_{N-k}(2k:1)$
are given by (7.59), (7.60), (7.62) and (7.63), respectively.


Unlike Proposition 7.7, $G_{N-k}(2k:1)$ is given by


Vk(2k!l)  -  ;r;7  tR(1)  +  b2<11  Vk+i(2k-1,IJ 

b  (1) 


(7.127) 


•  For k = 2,3,...,N and i = 2,4,...,2k-2
the parameters $H_{N-k}(i:1)$, $G_{N-k}(i:1)$ and $F_{N-k}(i:1)$ are given
by (7.64) - (7.67).

8)  By symmetry we have (for each k = 1,2,...,N), for i = 1,2,...,2k
the relationships (7.72) - (7.76).

Unlike the commensurate goals problem of Proposition 7.7, where
$K_{N-k}(2k+1:1) = K_{N-k}^{LB}(1)$ (the lower-bound parameter).


9)  The joining points $\{\delta_{N-k}(i) : i = 1,\ldots,4k\}$ are given by


for  k  =  1,2, ... ,N 


5  .  (2k)  = 

N-k 


-a 


a(l) 


i1  + 


b2(l) 

R(l) 


V 


k+1 


(2k) 


R (1)  +  b2(l)  K^^dk-l) 
R(l)  +  b2(l)  Vk+l(2k) 


(7.128) 


•  for  k  =  1,2,...,N  and  i  =  l,2,...,k 


N-k 


(2i-l)  = 


a(l) 


R  (1)  N-i+1 


\  k-i 

(2i-l)  n 
/  5,=1 


2  ^ 

^1+  R  (1)  KN-k+«.(2l-1) } 


a  (1) 


(7.129) 


*  for  k  =  2,3, ...,N  and  i  =  l,2,...,k-l 


W2i)  =  -HN-k(2i:1)  VVk(2i:1)  - 4 


Vk(2i:1) 

-Vk(2i+1:1) 


W2i:1) 


2[KN-k(2i:1)  -  Vk(2i+1:1)1 


. Z2L.  (i  +  -XLI  v  / 2 i) ) 

a  (1)  U  R ( 1)  N-i+1  21 


k-x  (1+  Itit W2i+1)) 

n  - 


i=l 


a(l) 


.  (l  -  A  -  \(D)  ) 


where  x,(i)  is  given  by  (C.16,8). 


(7.130) 


Here we define $\prod_{\ell=1}^{\theta}(\cdot) = 1$ if $\theta < 1$.


10) At each time k = 1,2,... we have the following relationships:


Vlc(l:l)  <  »•••<  Vk{2k"1:1)  <  Vk(2k+1:1)  <  KN-lc(2k:1)J 


odd-indexed  parameters 


(7.131) 


Vk(2k-i:D  <  Vk(2:1)  <  <  Vk(2k:1L 


even-indexed  parameters 


(7.132) 


Vk(2i"l5l)  <  6N-k(2i:1)  <  Vk(2i+1:1) 

x  3  1, . . . ,k 


(7.133) 


vVk(1)  <  •••  <  1W2k-1)  '  W2**11  <  W15  (7-134) 


odd-indexed  parameters 


Vk(2k-1)  <  Vk<2>  <  ...  <  Vk<2K>, 


(7.135) 


even-indexed  parameters 


W2k+1)  <  KN-k(2k+2) 


(7.136) 




!W2k+1'1>  -  Vk(1> 

11) For i = 1,2,...,k

$V_{N-k}(2i:1) > V_{N-k}(2i-1:1)$ except
equality at $x_{N-k} = \delta_{N-k}(2i-1)$

and

$\delta_{N-k}(2i)$ is the rightmost intersection of
$V_{N-k}(2i:1)$ and $V_{N-k}(2i+1:1)$. □

Proof:  The  proof  of  this  proposition  involves  an  induction  on  (N-k) , 

starting  with  k=2 .  The  proof  is  developed  in  appendix  C.16. 

For this problem the optimal control law and expected cost-to-go
at each time $k = k_0,\ldots,N$ (for any finite time horizon problem) can
be computed recursively from a growing number of difference equations
running backwards in time. As with the commensurate goals problem of
Proposition 7.7, for this conflicting goals problem we need not follow
all the flowchart comparisons and tests of Section 7.2. This problem
thus lends itself to detailed analysis and interpretation.


=  Vk(1)  and 

.LB 


=  Vl 


.  (1) 


In particular we can see what each piece of the controller is trying
to accomplish. Refer again to figure 7.28, where a typical
$V_{N-k}(x_{N-k}, r_{N-k}=1)$ is shown. The middlepiece is the upperbound
cost function $V^{UB}_{N-k}(1)$, associated with always having x in the
p(1,2:x) region where the probability of failure is high. The endpieces
coincide with the lowerbound cost function $V^{LB}_{N-k}(1)$, since the
controller never drives x into the high p(1,2:x) region if
$x_{N-k} < \delta_{N-k}(1)$ or $x_{N-k} > \delta_{N-k}(4k)$. The even-numbered
pieces correspond to hedging-to-a-point at successively further times in
the future (as we move away from the middlepiece in figure 7.28), with
successively higher costs being incurred.

In figure 7.29 the optimal control law $u_{N-k}(x_{N-k}, r_{N-k}=1)$ is shown
for these problems. Note that this control law is discontinuous at the
joining points $\{\delta_{N-k}(i) \mid i = 2,4,\ldots,2k\}$, where
$V_{N-k}(x_{N-k}, r_{N-k}=1)$ is not differentiable. In figure 7.30 the
optimal mapping from $x_{N-k}$ to $x_{N-k+1}$ (given $r_{N-k} = 1$) is
graphed. There is a region of avoided $x_{N-k+1}$ values
associated with each control law discontinuity. Thus at all times (N-k)
we have the type of behavior illustrated at time (N-1) in Fact 6.10.

Let us now consider the JLQ optimal controller of Proposition 7.12
as the time horizon (N-k) grows large. From (7.118) we see that the
number of pieces of the optimal controller grows without bound as (N-k)
goes to infinity. Thus the exact infinite time horizon optimal controller
is not obtainable via any finite algorithm. However, we can
obtain a description of the Proposition 7.12 controller as (N-k) grows
large which is similar to that given in Proposition 7.8 of the previous
section.^1 As (N-k) grows large, many of the controller pieces correspond
to moving the state from one p(1,2:x) piece to another at a time far in
the future. The difference between any of these pieces and the endpiece
cost function becomes small as the time when this change is effected becomes
distant. Consequently the structure of the optimal controller converges
to a steady-state controller which we can approximate with arbitrarily
small error by choosing suitably large (N-k).

^1 describing the Proposition 7.7 controller as $(N-k) \to \infty$

Figure 7.30: Optimal $x_{N-k+1}$ given $x_{N-k}$ and $r_{N-k} = 1$ for the
problems of Proposition 7.12.

The discussion which precedes Proposition 7.8 in Section 7.5 is
applicable here with the exceptions that

• hedging-to-a-point is to $-a^{-}$ and $a^{+}$ (instead of to $-a^{+}$ and $a^{-}$);

• the endpiece cost functions are lower bounds on
$V_{N-k}(x_{N-k}, r_{N-k}=1)$, and the odd-indexed pieces $V_{N-k}(i:1)$
(i=1,3,5,...) approach $V^{LB}_{N-k}(1)$ from above^1, at each x. The
even-indexed pieces $V_{N-k}(i:1)$ (i=2,4,6,...) approach it from below.

As  in  section  7.5,  it  is  easier  to  discuss  the  structure  of  the 
limiting  optimal  controller  (as  (N-k)  -**>)  if  we  count  pieces  from  the 
middlepiece  outwards.  Using  this  indexing  method  (as  described  prior 
to  Proposition  7.8  and  as  shown  in  figure  7.22),  we  can  summarize  the 
structure  of  the  limiting  controller  for  the  problem  of  7.12  as 
follows: 

Proposition 7.13: For the problem of Proposition 7.12, as $(N-k) \to \infty$
items (1) - (7) of Proposition 7.8 hold except that (7.90) is replaced
by

1 


instead  of  from  below,  as  in  section  7.5 


<+l> 


G  <+!>=  lim 


(N-k)-** 

a2 


N-k  - 


b2(l) 


R(l)  +  b  (1) 


( l-^2 ) (Km<0>  +  Q(l)] 


+W2  (K00(l:2)  +  Q  (2) ) 


In  addition  we  have  the  following: 


(7.137) 


1. As $(N-k) \to \infty$ the joining point
$\delta_{N-k}\langle -1\rangle = \delta_{N-k}(2k)$ converges monotonely to


6  <-l>  =  lim  5  <-l> 

oo  N-k 

(N_k)-M»  N 


a  (1) 


-  £ 
1)  VJ 


1  +  ^T7T^-K^M(l)|il  -  /I 


R(l)  00 


LM, 


R  (1)  +b  (1)  [  (1-cjo)  (KT  (1)  +Q  (!)  )  + 


oj2(K  (1:1)  +Q  ( 2 ) ) 


R(l)+b2  (1)  tl-^L)  (K^M(1)+Q(D)  +T) 


(l-v  i 

L  +  “i 


(KJ1:2)+Q(2)  j| 


(7.138) 


and 


<5M  - <+l>-f  6  <+l>  =  -6  <-l> 

1  00  oo 


2.  For  k.*>l,2  for  each  fixed  i  6  {l  2f...(k}  t  the  joining 

points  $N  ^<-2i>  converge  monotonely  as  (N-k)-*°°  to 


<5  <-2i)> 

00 


6  <-2(i:l)> 
00 

a(l) 


1  + 


b2(l) 

R(l) 


(l-w2)  (^<-2(1+1)  >  +  Q(l) ) 
+  u)2(Koo(1:2)  +  Q ( 2) ) 


and  .  <2i>-*  5  <2i>  =  -<$  <-2i> 


(7.139) 


3. For k = 2,3,...,N, for each fixed $i \in \{1,2,\ldots,k\}$ the joining
points $\delta_{N-k}\langle -(2i+1)\rangle$ converge as $(N-k) \to \infty$ to


6oo<-(2i+l) 


>  =  f"-H  <~(2i+2)> 

U — r 


<2i-2>)  -  4[Koo<-2i-2>  -  K  <-2i-l>]G  <-2i-2> 


2  [Koo<-2i-2>  -  K  <-2i-l>] 


(7.140) 


and 


o  <2i+l>-*>  5  <2i+l>  a  -6  <-  (2i+l)  >  . 

no 


4. For the limiting problem solution parameters, as $(N-k) \to \infty$ the
following relationships hold:


(i)  K^d)  =  KRe(l)<-..<K  <+4>  <K  <+2>  <K  <0>  <K  <+l>  (7.141) 

oo  CO  CO—  00  —  00  00- 

(ii)  K  <+2>  < . . . <X  <+3>  <K  <■*-!>  (7.142) 

00  —  00  —  CO  — 

(iii)  5  <-2(i+l)>  «5  <-2i>  <<5  <-2(i-l)>  (7.143) 

00  00  oo 

hence 

<5  <2i+l>  -  6  <2i-l>  >0  (7.144) 

00  00 

6  <-(2i+l)>  -  6  <-(2i-l)>  >0 

00  oo 


Proposition 7.13 says that as the time horizon becomes infinite, the
number of pieces in the optimal controller also becomes infinite but each
piece (counting from the center outwards) converges to a constant
steady-state function that is optimal over a constant steady-state interval
of x values. From figure 7.30 we see that there will be certain steady-state
intervals of x values that the system will avoid. From (7.144) we see
that the width of the switching regions will grow without bounds as (N-k)
increases. We cannot implement precisely the steady-state JLQ controllers
of Proposition 7.13 and 7.8 using a finite algorithm, because there are
infinitely many controller pieces. However, we can approximate the
steady-state controller by using, at each time (N-k), the true optimal
controller for a certain number of pieces around zero and approximating
the rest of $V_{N-k}(x_{N-k}, r_{N-k}=1)$ by the endpiece functions. This
method of approximation is discussed in the next section.

In  this  section  we  have  examined  the  structure  of  the  optimal  JLQ 
controller  for  a  class  of  problems  having  conflicting  reliability  and 
performance  goals.  This  problem  class  has  a  solution  structure  that  is 
particularly  amenable  to  detailed  analysis.  Its  solution  illustrates 
the  way  that  the  optimal  controller  uses  active  hedging  to  achieve 
fault  tolerance  in  this  conflicting  goals  case.  The  major  difference 
between  this  case  and  the  commensurate  goals  problems  of  section  7.5  is 
the  following: 

• When the performance and reliability goals are commensurate,
the JLQ controller uses hedging-to-a-point to drive the x
process into the advantageous probability region sooner than
the x-independent JLQ controller (of chapter 3) would.

• In the case where these goals are conflicting, the JLQ controller
uses hedging-to-a-point to keep the x process out of the
disadvantageous piece for a longer time than the x-independent
JLQ controller (of chapter 3) would.
7.7  Finite Look-Ahead Approximations of the Optimal JLQ Controller


The solution algorithm flowchart of section 7.2 lets us obtain
the optimal JLQ controller for the general problem of chapter 5.^1
This algorithm is more efficient than the brute force approach
used in chapter 5, in that Propositions 5.2 and 5.3 are used to greatly
reduce the number of calculations and comparisons that are needed to obtain
the optimal controller. However, the algorithm of section 7.2 is burdensome
to compute and difficult to implement when the time horizon of the
control problem is large. This is because the number of pieces of the
optimal control law (in each form) grows linearly with the time horizon.

In this section we develop an approximation of the true optimal JLQ
controller that requires less computation and is easier to implement.
This approximation is applicable for any problem in the class that was
formulated in section 5.2.

Our  approximation  of  the  optimal  JLQ  controller  for  the  general 
class  of  problems  of  chapter  5  is  motivated  by  the  structures  of  the 
optimal  controllers  in  the  special  problems  that  were  studied  in  sections 
7.3  -  7.6.  We  will  first  develop  suboptimal  approximations  of  the 
optimal  JLQ  controller  for  these  problems.  Then  we  will  develop  a  similar 
approximation  that  applies  to  the  general  case. 

^1 as formulated in section 5.2

Recall that for the optimal controllers described by Propositions
7.7 and 7.12, the optimal control laws and expected cost-to-go can be
computed off-line, backwards in time via sets of (growing numbers of)
recursive difference equations. In Propositions 7.8 and 7.13, the
limiting structures of the optimal controllers for these problems as
the time horizon becomes infinite are specified by (countably) infinite
sets of difference equations. That is, even for these relatively
simple problems, the optimal controller for the infinite time horizon
situation cannot be obtained using a finite algorithm. However, pieces
of the optimal controller (for the controllers of Propositions 7.7, 7.8,
7.12 and 7.13) that are successively further from the middlepiece are seen
to converge to the controller endpieces (as functions of x). This
suggests a natural approximate controller for these problems: at each
time N-k (k > p, for some fixed specified p), we apply the true optimal
controller when the x process is in the domain of the (4p + 1) pieces
closest to (and including) $x_{N-k} = 0$, and we approximate the optimal
controller for other $x_{N-k}$ values by the endpiece control laws.


Proposition  7.14  (4p+l  piece  suboptimal  controller): 


For  JLQ  problems  satisfying  the  assumptions  of  Proposition  7.7  or  7.12 
the  optimal  JLQ  controller  can  be  approximated  as  follows: 

For $p \ge 1$ fixed,

• at times N-p,...,N-1 use the true optimal control laws (if in
form 1):

$u_{N-p}(x_{N-p}, r_{N-p}=1), \ldots, u_{N-1}(x_{N-1}, r_{N-1}=1)$

(as computed using the algorithm of section 7.2), with expected
cost-to-go

$V_{N-p}(x_{N-p}, r_{N-p}=1), \ldots, V_{N-1}(x_{N-1}, r_{N-1}=1), V_N(x_N, r_N=1)$

• at times (N-k), for k > p, if $r_{N-k} = 1$:

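(i) use the middlepiece control law

$u_{N-k}(2k+1:1)$  if  $\delta_{N-k}(2k) < x_{N-k} < \delta_{N-k}(2k+1)$

(ii) use

$u_{N-k}(2k:1)$  if  $\delta_{N-k}(2k-1) < x_{N-k} < \delta_{N-k}(2k)$

$u_{N-k}(2k+2:1)$  if  $\delta_{N-k}(2k+1) < x_{N-k} < \delta_{N-k}(2k+2)$

(iii) if $p \ge 2$ then for $\ell = 1,\ldots,p-1$ use

$u_{N-k}(2(k-\ell):1)$  if  $\delta_{N-k}(2(k-\ell)-1) < x_{N-k} < \delta_{N-k}(2(k-\ell))$

$u_{N-k}(2(k-\ell)+1:1)$  if  $\delta_{N-k}(2(k-\ell)) < x_{N-k} < \delta_{N-k}(2(k-\ell)+1)$

$u_{N-k}(2(k+\ell)+1:1)$  if  $\delta_{N-k}(2(k+\ell)) < x_{N-k} < \delta_{N-k}(2(k+\ell)+1)$

$u_{N-k}(2(k+\ell)+2:1)$  if  $\delta_{N-k}(2(k+\ell)+1) < x_{N-k} < \delta_{N-k}(2(k+\ell+1))$

(iv) use the endpiece controllers

$u^{Le}_{N-k}(1) = u_{N-k}(1:1)$  if  $x_{N-k} < \delta_{N-k}(2(k-p))$

$u^{Re}_{N-k}(1) = u_{N-k}(4k+1:1)$  if  $x_{N-k} > \delta_{N-k}(2(k+p)+1)$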

The resulting suboptimal controller has (4p+1) pieces at all times
(N-k) < (N-p). Let us denote the expected cost-to-go from
$(x_{N-k}, r_{N-k}=1)$ that corresponds to this approximate controller by
$\hat{V}_{N-k}(x_{N-k}, r_{N-k}=1)$. It is related to the true optimal JLQ
expected cost-to-go $V_{N-k}(x_{N-k}, r_{N-k}=1)$ as follows:

(i) For problems satisfying the assumptions of Proposition 7.7
(commensurate goals):

$$x^2_{N-k}\, K_{N-k}(2(k-p+1)+1:1) \le V_{N-k}(x_{N-k}, r_{N-k}=1) \le \hat{V}_{N-k}(x_{N-k}, r_{N-k}=1) \le x^2_{N-k}\, K_{N-k}(1:1) \qquad (7.145)$$

for $x_{N-k}$ outside $(\delta_{N-k}(2(k-p)),\ \delta_{N-k}(2(k+p)+1))$.

(ii) For problems satisfying the assumptions of Proposition 7.12
(conflicting goals):

$$x^2_{N-k}\, K_{N-k}(1:1) \le V_{N-k}(x_{N-k}, r_{N-k}=1) \le \hat{V}_{N-k}(x_{N-k}, r_{N-k}=1) \le x^2_{N-k}\, K_{N-k}(2(k-p+1)+1:1) \qquad (7.146)$$

for $x_{N-k}$ outside $(\delta_{N-k}(2(k-p)),\ \delta_{N-k}(2(k+p)+1))$.


3. Consequently the suboptimality of the approximate controller at
any $x_{N-k}$ is bounded by

$$\bigl|\hat{V}_{N-k}(x_{N-k}, r_{N-k}=1) - V_{N-k}(x_{N-k}, r_{N-k}=1)\bigr| \le x^2_{N-k}\,\bigl|K_{N-k}(1:1) - K_{N-k}(2(k-p+1)+1:1)\bigr| \qquad (7.147)$$
Proof:

This Proposition follows almost immediately from Facts 7.1 - 7.3,
and Propositions 7.7, 7.12. Details are given in Appendix C.17.

The special structure of the JLQ controllers of Propositions 7.12
and 7.7 lets us describe the suboptimality of this time-varying approximate
controller. In Figure 7.31 we illustrate the bounds of (7.145) and (7.146).
Notice that as the controller pieces (of the optimal controller) have
domains further from the middlepiece, they look more and more like the
endpieces. This observation motivates our substitution of the endpiece
control law for those optimal control law pieces that are far from zero
(i.e., outside $(\delta_{N-k}(2(k-p)),\ \delta_{N-k}(2(k+p)+1))$).

In Figures 7.32 and 7.33 we show the control laws and the
$x_{N-4} \to x_{N-3}$ mapping for the optimal JLQ controller and a p = 3
approximation, for an example problem of the type that is addressed by
Proposition 7.7.

The solid lines indicate the optimal controller quantities. The
thick, checkered line denotes the p=3 approximation. Note that the p=3
approximation coincides with the optimal solution in figures 7.32 and 7.33
except for

$x_{N-4} \in (\delta_{N-4}(1),\ \delta_{N-4}(3))$  and  $x_{N-4} \in (\delta_{N-4}(14),\ \delta_{N-4}(16))$.

By making p larger and larger, we can increase the width of the
interval about the origin where the true optimal controller is used
(for finite time horizon problems, if we make p = N-1 then the approximate
controller is in fact optimal).


Figure 7.32: Optimal JLQ control law and p = 3 approximation at
time (N-4) for a problem addressed by Proposition 7.7. The
thick checkered line denotes the approximation and the solid
line indicates the optimal controller.

Figure 7.33: Optimal controller and p = 3 approximate controller
$x_{N-4} \to x_{N-3}$ mappings for a problem addressed by Proposition 7.7.
The thick checkered line denotes the p = 3 approximation and
the solid line indicates the optimal controller.


In sections 7.5 and 7.6 we saw that for fixed k, we have the
convergence

$$K_{N-k}(2(k-p+1)+1:1) \to K_{N-k}(1:1) = K^{Le}_{N-k}(1) \qquad (7.148)$$

as p grows large. Therefore as p is made larger we also reduce the
suboptimality of the approximate controller of Proposition 7.14 at each
$x_{N-k}$ outside $(\delta_{N-k}(2(k-p)),\ \delta_{N-k}(2(k+p)+1))$. However,
the number of controller pieces that must be calculated and implemented
increases linearly with p. Thus we have a tradeoff between controller
suboptimality and complexity.

The  approximation  algorithm  that  is  described  above  results  in  time- 
varying  control  laws.  All  4p+l  pieces  of  the  approximate  controller 
must  be  computed  anew  at  each  time  (N-k) .  We  can  further  simplify  the 
computational  burden  of  controller  determination  if  we  use  the  4p+l 
closest  pieces  (to  zero)  of  the  steady-state  controllers,  as  given  by 
Proposition  7.8  and  7.13,  instead  of  the  finite  time  horizon  ones. 

Proposition  7.15 

1. For JLQ problems satisfying the assumptions of Propositions 7.7, 7.8
or 7.12, 7.13, the optimal JLQ controller can be approximated by a
constant suboptimal controller as follows:

(1) In the form r=2 use the steady-state controller and for $p \ge 1$
fixed, at all times (N-k):


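(i) use the steady-state middle-piece control law

$u_\infty\langle 0\rangle$  if  $\delta_\infty\langle -1\rangle < x_{N-k} < \delta_\infty\langle 1\rangle$

(ii) use

$u_\infty\langle 1\rangle$  if  $\delta_\infty\langle 1\rangle < x_{N-k} < \delta_\infty\langle 2\rangle$

$u_\infty\langle -1\rangle$  if  $\delta_\infty\langle -2\rangle < x_{N-k} < \delta_\infty\langle -1\rangle$

(iii) if $p \ge 2$ and $k \ge 2$, then for $\ell = 1,\ldots,\min(p-1,k-1)$ use

$u_\infty\langle -2\ell\rangle$  if  $\delta_\infty\langle -(2\ell+1)\rangle < x_{N-k} < \delta_\infty\langle -2\ell\rangle$

$u_\infty\langle -(2\ell+1)\rangle$  if  $\delta_\infty\langle -(2\ell+2)\rangle < x_{N-k} < \delta_\infty\langle -(2\ell+1)\rangle$

$u_\infty\langle 2\ell\rangle$  if  $\delta_\infty\langle 2\ell\rangle < x_{N-k} < \delta_\infty\langle 2\ell+1\rangle$

$u_\infty\langle 2\ell+1\rangle$  if  $\delta_\infty\langle 2\ell+1\rangle < x_{N-k} < \delta_\infty\langle 2\ell+2\rangle$

(iv) use the steady-state endpiece controllers

$u^{Le}_\infty(1)$  if  $\delta_\infty\langle -2p\rangle > x_{N-k}$

$u^{Re}_\infty(1)$  if  $\delta_\infty\langle 2p\rangle < x_{N-k}$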

(Note:  we  are  using  the  <  >  notation,  indicating  indexing  from  the 
middle  outwards,  that  was  introduced  in  section  7.5  and  was  used  in 
Propositions  7.8  and  7.13). 


2. At times (N-k) > (N-p) the resulting controller has (4k+1) pieces.
When (N-k) ≤ (N-p), the controller has (4p+1) pieces. As the time
horizon becomes infinite (as $(N-k) \to \infty$), the suboptimality of the
approximate controller at each x value is bounded by

$$x^{2}\,\bigl|K^{Le}_{\infty}(1) - K_{\infty}\langle 2(p-1)\rangle\bigr| \qquad (7.149)$$

and as $p \to \infty$ this error at each x converges to zero, since

$$\lim_{p \to \infty} K_{\infty}\langle 2(p-1)\rangle = K^{Le}_{\infty}(1).$$

Proof: Immediate from Propositions 7.8, 7.13 and 7.14. □

The approximate steady-state controller does not coincide, in general,
with the true optimal JLQ controller at any x value. However, for
large time horizons the steady-state control law pieces (including the
joining points $\{\delta_\infty\langle i\rangle\}$) do become close to the true
optimal. In addition, bounds like those of Proposition 7.14 hold as
$(N-k) \to \infty$.

In example 7.2 we compare the controllers and expected performance
of the optimal JLQ controller and both the time-varying and steady-state
approximate controllers.


Example 7.2 (= 7.1, 6.1, 5.1):

Consider the following system having M=2 forms:

$x_{k+1} = x_k + u_k$     if $r_k = 1$
$x_{k+1} = 2x_k + u_k$    if $r_k = 2$

$p(1,2:x) = 1/4$ for $|x| < 1$,  $3/4$ for $|x| > 1$

$p(1,1:x) = 1 - p(1,2:x)$,   $p(2,2) = 1$,   $p(2,1) = 0$.

We seek to minimize

$$\min_{u_0,\ldots,u_{N-1}} E\Bigl\{ \sum_{k=0}^{N-1} \bigl[ u_k^2 + x_{k+1}^2 \bigr] + x_N^2\, K_T(r_N) \Bigr\}$$

where $K_T(1) = 0$, $K_T(2) = 3$. The form structure and form transition
probability p(1,2:x) for this example were shown in figure 5.4.

This problem was examined earlier in sections 5.3, 5.5, 6.3 and 7.2.
It is a "Case 1" problem at time (N-1), in the sense of section 6.5.
That is, it satisfies (6.108):

$$(\omega_2 - \omega_1)\,\bigl[(K_T(2) + Q(2)) - (K_T(1) + Q(1))\bigr] = \tfrac{1}{2}[4 - 1] = 3/2 > 0.$$

The optimal controller in form 1 at time (N-1) is specified by fact 6.9.

This example problem satisfies the assumptions of facts 7.1, 7.2 and
7.3(1): (7.22) becomes $0 \le K_T(2) = 3 < 3.236068$. For this example
problem, (7.35) is satisfied:

$$a^2(1) = 1 < \tfrac{1}{2}\bigl(1 + \hat{K}(2)\bigr) = \tfrac{1}{2}(1 + 7/4) = 11/8.$$

Thus by fact 7.6, we have "situation (1)" of figure 7.16 at time
(N-2). Consequently the optimal controller in form 1 at each time is
specified by Proposition 7.7 and the limiting controller as
$(N-k) \to \infty$ is given by Proposition 7.8.

In table 7.5 the optimal JLQ controller parameters are given
for three time steps when the system is in form r=2.

(N-k)    K_{N-k}(1:2)    L_{N-k}(1:2)    a(2) - b(2) L_{N-k}(1:2)
N        3               -               -
N-1      3.2             1.6             .4
N-2      3.231           1.615           .3846
N-3      3.235           1.618           .3824
∞        3.236           1.618           .3820

Table 7.5: Optimal JLQ Controller Parameters in Form r=2
for Example 7.2
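The entries of Table 7.5 can be checked with a short scalar backward recursion. The sketch below is illustrative only: it assumes the form-2 parameters a(2) = 2, b(2) = 1, Q(2) = R(2) = 1 (values inferred from the table entries and from the check of (6.108), not quoted from (7.49) - (7.51)), and the function name `form2_riccati` is hypothetical.

```python
def form2_riccati(a, b, q, r, k_terminal, steps):
    """Backward scalar LQ recursion for the absorbing form r = 2.

    Assumed stage cost: u_k**2 * r + x_{k+1}**2 * q, so the quadratic
    weight seen one step ahead is p = K_{k+1} + q.
    Returns a list of (K, L, a - b*L), starting at time N-1.
    """
    rows = []
    k_next = k_terminal
    for _ in range(steps):
        p = k_next + q
        gain = a * b * p / (r + b * b * p)        # L in the law u = -L x
        k_now = a * a * p * r / (r + b * b * p)   # cost-to-go coefficient K
        rows.append((k_now, gain, a - b * gain))
        k_next = k_now
    return rows


# Assumed parameters for form 2 of Example 7.2 (see the note above).
rows = form2_riccati(a=2.0, b=1.0, q=1.0, r=1.0, k_terminal=3.0, steps=3)
for label, (k, gain, closed_loop) in zip(("N-1", "N-2", "N-3"), rows):
    print(f"{label}: K = {k:.3f}, L = {gain:.3f}, a - bL = {closed_loop:.4f}")

# Iterating further approaches the steady-state row of the table.
steady = form2_riccati(2.0, 1.0, 1.0, 1.0, 3.0, steps=60)[-1]
```

Under these assumptions the fixed point of the recursion satisfies K² - 2K - 4 = 0, i.e. K = 1 + √5 ≈ 3.236068, which agrees with the bound quoted for (7.22).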
The optimal JLQ controller parameters in form r=1 for three time steps
and for the 13 pieces of the steady-state controller closest to zero
are given in Table 7.6. Only the left sides ($x_{N-k} < 0$) of the controllers
are specified since the right-side parameters are obtained by the symmetry
relations of Propositions 7.7, 7.8. Also, since $L_{N-k}(i:1) = K_{N-k}(i:1)$,
the $L_{N-k}$'s are not listed separately.

Comparing the columns for (N-k) = (N-3) and (N-k) = ∞ in Table 7.6,
we see that the values of each parameter
$\{K_{N-3}\langle -i\rangle,\ K_\infty\langle -i\rangle,\ i = 1,\ldots,5\}$
are very close. This suggests that the approximation in Proposition 7.15
is a good one.

Let us compare the performance of the optimal JLQ controller and
the approximate controllers of Propositions 7.14 and 7.15, for a problem
where N=3. This is, of course, a limited calculation that has only
academic importance. But it does provide some insight into the behavior
of the various controllers. From Table 7.6 we see that the optimal
JLQ controller^1 is:


(1) Optimal JLQ Controller:

u_0(x_0, r_0=1) =
    -(.78348) x_0                 x_0 < -138.13
    -(.78352) x_0 - .00291        -138.13 < x_0 < -54.686
    -(.78347) x_0                 -54.686 < x_0 < -32.285
    -(.78600) x_0 - .05350        -32.285 < x_0 < -15.315
    -(.78245) x_0                 -15.315 < x_0 < -7.0110
    -x_0 - 1.000                  -7.0110 < x_0 < -3.2329
    -(.6996) x_0                  -3.2329 < x_0 < 3.2329

^1 We have listed the control laws for x < 0, since the x > 0 laws are
directly obtained by symmetry.


Parameter                            (N-k) = N   N-1        N-2         N-3          ∞

K_{N-k}(2k+1:1) = K_{N-k}<0>         0           .6364      .6949       .6996        .7007
K_{N-k}(2k:1)   = K_{N-k}<-1>        -           1.000      1.000       1.000        1.000000
K_{N-k}(2k-1:1) = K_{N-k}<-2>        -           .7647      .7807       .78245       .7827063
K_{N-k}(2k-2:1) = K_{N-k}<-3>        -           -          .7849       .78600       .786190
K_{N-k}(2k-3:1) = K_{N-k}<-4>        -           -          .7822       .78347       .7836775
K_{N-k}(2k-4:1) = K_{N-k}<-5>        -           -          -           .78352       .7837182
K^{Le}_{N-k}(1) = K_{N-k}(1:1)       0           .7647      .7822       .78348       .7836889

H_{N-k}(2k:1)   = H_{N-k}<-1>        -           2.0000     2.0000      2.00000      2.000000
H_{N-k}(2k-2:1) = H_{N-k}<-3>        -           -          .1075       .10700       .1069449
H_{N-k}(2k-4:1) = H_{N-k}<-5>        -           -          -           .00582       .0057804

G_{N-k}(2k:1)   = G_{N-k}<-1>        -           2.75       3.2773      3.32884      3.3340644
G_{N-k}(2k-2:1) = G_{N-k}<-3>        -           -          .6741       .80594       .8201529
G_{N-k}(2k-4:1) = G_{N-k}<-5>        -           -          -           .16848       .2049996

F_{N-k}(2k:1)   = F_{N-k}<-1>        -           -1.0000    -1.0000     -1.0000      -1.00000
F_{N-k}(2k-2:1) = F_{N-k}<-3>        -           -          -.05375     -.05350      -.053472
F_{N-k}(2k-4:1) = F_{N-k}<-5>        -           -          -           -.00291      -.002890

δ_{N-k}(2k)   = δ_{N-k}<-1>          -           -2.75      -3.2773     -3.23288     -3.3345113
δ_{N-k}(2k-1) = δ_{N-k}<-2>          -           -6.7749    -6.9765     -7.00102     -7.0177289
δ_{N-k}(2k-2) = δ_{N-k}<-3>          -           -          -12.5375    -15.31494    -15.595679
δ_{N-k}(2k-3) = δ_{N-k}<-4>          -           -          -31.1791    -32.28468    -32.507589
δ_{N-k}(2k-4) = δ_{N-k}<-5>          -           -          -           -54.68637    -72.108158
δ_{N-k}(1)                           -           -6.7749    -31.1991    -138.1297    -

m_{N-k}(1) (number of pieces)        1           5          9           13           ∞

Table 7.6: Optimal JLQ Controller Parameters in Form r=1
for Example 7.2


u_1(x_1, r_1=1) =
    -(.7822) x_1                  x_1 < -31.179
    -(.7849) x_1 - .05375         -31.179 < x_1 < -12.538
    -(.7807) x_1                  -12.538 < x_1 < -6.9765
    -x_1 - 1.000                  -6.9765 < x_1 < -3.2773
    -(.6949) x_1                  -3.2773 < x_1 < 3.2773

u_2(x_2, r_2=1) =
    -(.7647) x_2                  x_2 < -6.7749
    -x_2 - 1.00                   -6.7749 < x_2 < -2.75
    -(.6364) x_2                  -2.75 < x_2 < 2.75


The  control  laws  for  the  suboptimal  controllers  of  Propositions  7.14  and 
7.15  for  various  p  values  are  as  follows: 


(2) Proposition 7.14 Controller with p=2

u_0(x_0, r_0=1) =
    -(.78348) x_0                 x_0 < -32.285
    -(.78600) x_0 - .05350        -32.285 < x_0 < -15.315
    -(.78245) x_0                 -15.315 < x_0 < -7.0110
    -x_0 - 1.000                  -7.0110 < x_0 < -3.23329
    -(.6996) x_0                  -3.23329 < x_0 < 3.23329

u_1(x_1, r_1=1) and u_2(x_2, r_2=1) as in (1) above.

(3) Proposition 7.14 Controller with p=1

u_0(x_0, r_0=1) =
    -(.78348) x_0                 x_0 < -7.0110
    -x_0 - 1.000                  -7.0110 < x_0 < -3.23329
    -(.6996) x_0                  -3.23329 < x_0 < 3.23329

u_1(x_1, r_1=1) =
    -(.7822) x_1                  x_1 < -6.9765
    -x_1 - 1.0000                 -6.9765 < x_1 < -3.2773
    -(.6949) x_1                  -3.2773 < x_1 < 3.2773

u_2(x_2, r_2=1) as in (2) above.

(4) Proposition 7.15 Controller with p=2

u_i(x_i, r_i=1) =
    -(.78369) x_i                 x_i < -32.5076
    -(.78619) x_i - .05345        -32.5076 < x_i < -15.5957
    -(.782706) x_i                -15.5957 < x_i < -7.0177
    -x_i - 1.0000                 -7.0177 < x_i < -3.3345
    -(.700659) x_i                -3.3345 < x_i < 3.3345

for i = 0,1.

u_2(x_2, r_2=1) =
    -(.78369) x_2                 x_2 < -7.0177
    -x_2 - 1.0000                 -7.0177 < x_2 < -3.3345
    -(.700659) x_2                -3.3345 < x_2 < 3.3345

and

u_i(x_i, r_i=2) is -(1.618) x_i

for i=1,2.
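How the (4p+1)-piece controllers (2) and (3) arise from the optimal law (1) can be sketched in a few lines. The example below is a minimal illustration, not the thesis algorithm: it stores the left-half pieces of u_0 from controller (1), counted from the middlepiece outwards, keeps only the 2p pieces nearest zero, and applies the endpiece law elsewhere. The names `LEFT_PIECES`, `ENDPIECE` and `control` are hypothetical, and odd symmetry of the law about x = 0 is assumed, following the symmetry remark in the footnote to controller (1).

```python
# Left-half (x <= 0) pieces of u_0(x_0, r_0 = 1) from controller (1), listed
# from the middlepiece outwards as (lower joining point, gain, offset).
# On each piece the law is u = -gain * x + offset.
LEFT_PIECES = [
    (-3.2329, 0.6996, 0.0),        # <0>  middlepiece
    (-7.0110, 1.0, -1.0),          # <-1> hedging-to-a-point piece
    (-15.315, 0.78245, 0.0),       # <-2>
    (-32.285, 0.78600, -0.05350),  # <-3>
    (-54.686, 0.78347, 0.0),       # <-4>
    (-138.13, 0.78352, -0.00291),  # <-5>
]
ENDPIECE = (0.78348, 0.0)          # (gain, offset) of the left endpiece law


def control(x, p=None):
    """Piecewise-linear law; p=None gives the full law of controller (1).

    With an integer p >= 1 only the 2p left-half pieces nearest zero are
    kept (4p+1 pieces counting both sides and the two endpieces).
    """
    if x > 0:
        return -control(-x, p)     # assumed odd symmetry about x = 0
    kept = LEFT_PIECES if p is None else LEFT_PIECES[: 2 * p]
    for lower, gain, offset in kept:
        if x > lower:
            return -gain * x + offset
    gain, offset = ENDPIECE
    return -gain * x + offset


u_full = control(-40.0)            # uses piece <-4> of the optimal law
u_p2 = control(-40.0, p=2)         # endpiece law, as in controller (2)
u_p1 = control(-20.0, p=1)         # endpiece law, as in controller (3)
```

With p = 2 the endpiece law takes over for x_0 < -32.285 and with p = 1 for x_0 < -7.0110, which matches the joining points listed for controllers (2) and (3).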


In table 7.7 the expected costs-to-go from $(x_0, r_0=1)$, for several
different $x_0$ values, are listed for these four controllers.

Note that the p=2 controller (2) obtains the same performance as
the optimal (when rounded to four digits). The p=1 controller does
almost as well. Thus for this example, the (4p+1)-piece controllers
of Proposition 7.14 perform well despite their simplicity (relative to
the optimal controller). Note also that using the steady-state control
laws in (4) does not seriously degrade performance (compare with (2),
which has the same number of pieces).


x_0 =                               -200     -100     -20      -10      -5       -1

controller (1) (optimal)            31340    7835     313.1    78.25    18.33    .6996
controller (2), p=2                 31340    7835     313.1    78.25    18.33    .6996
controller (3), p=1                 31340    7835     313.2    78.28    18.33    .7087
controller (4), p=2 steady-state    31340    7835     313.1    78.25    18.33    .7024

Table 7.7: Expected Costs-to-go from $(x_0, r_0=1)$ for
Different Controllers in Example 7.2 (entries
rounded to four places).
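The first row of Table 7.7 can be reproduced from the (N-3) column of Table 7.6. The sketch below evaluates the piecewise-quadratic cost V = K x² + H x + G on each piece; it assumes that H = G = 0 on the pieces for which Table 7.6 lists no H or G entry, and that the cost is even in x (by the symmetry relations). The names `COST_PIECES`, `END_K` and `cost_to_go` are hypothetical.

```python
# (lower joining point, K, H, G) for x_0 <= 0, middlepiece outwards.
# K, H, G come from the (N-3) column of Table 7.6; joining points are those
# printed in controller (1). H = G = 0 where the table has no entry (assumed).
COST_PIECES = [
    (-3.2329, 0.6996, 0.0, 0.0),          # <0>
    (-7.0110, 1.0, 2.0, 3.32884),         # <-1>
    (-15.315, 0.78245, 0.0, 0.0),         # <-2>
    (-32.285, 0.78600, 0.10700, 0.80594), # <-3>
    (-54.686, 0.78347, 0.0, 0.0),         # <-4>
    (-138.13, 0.78352, 0.00582, 0.16848), # <-5>
]
END_K = 0.78348                           # endpiece coefficient


def cost_to_go(x):
    """Expected cost-to-go V_0(x_0, r_0 = 1) under the optimal controller."""
    x = -abs(x)                           # assumed even symmetry in x
    for lower, k, h, g in COST_PIECES:
        if x > lower:
            return k * x * x + h * x + g
    return END_K * x * x


costs = {x0: cost_to_go(x0) for x0 in (-200, -100, -20, -10, -5, -1)}
for x0, value in costs.items():
    print(x0, round(value, 4))
```

Rounded to four significant figures, these values agree with the controller (1) row of Table 7.7, which supports the reading of the K, H and G rows used here.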

It  is  important  to  note  that  it  is  the  special  structure  of  the  JLQ 
problems  which  are  addressed  by  Propositions  7.7,  7.8,  7.12  and  7.13 
that  enables  us  to  implement  the  approximate  controllers  denoted  above. 

We  are  able  to  obtain  the  controller  pieces  that  are  required  by  the 
approximate  controllers  of  Propositions  7.14,  7.15  without  using  the 
algorithm  of  section  7.2  for  these  special  problems. 

For the general class of problems of Chapter 5, the optimal JLQ
controller will not have the nice structure of the problems of sections
7.5 and 7.6. We will not be able to compute only the (4p+1) closest
pieces to zero of the optimal controller. However, the approximations
that we used above can be interpreted in a way that suggests an alternate
suboptimal approximate controller that is applicable to the general
problem class. This alternate controller does not coincide with the
controllers in Propositions 7.14, 7.15. It does not assume the special
problem structure possessed by the problems of Sections 7.3 - 7.6.

In figures 7.19 and 7.28 we saw that the cost-to-go pieces that
correspond to changing the transition probability pieces that x is in
at far future times tend to look alike. That is, using the control
to change transition probability pieces at times further and further in
the future has less and less effect. In particular, the expected
costs of such strategies become close to the endpiece costs (i.e., of
never changing the transition probability pieces that x is in).

One interpretation of the time-varying controller of Proposition 7.14
is that it is a finite look-ahead approximation of the controllers in
Propositions 7.7 and 7.12. It assumes that the transition probability
pieces that the x process is located in will either change in the next
p time steps, or not at all. The approximate controller ignores
eventualities that might occur^1 beyond a fixed planning time. By ignoring
the far future, optimality is lost but the computational burden of
determining and the complexity of implementing the resulting controller
are greatly reduced.
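The look-ahead interpretation can be illustrated for the shortest horizon, p = 1, on the form-1 dynamics of Example 7.2. The sketch below minimizes the one-step cost by brute force over a grid of controls. It assumes a(1) = b(1) = 1, Q(1) = Q(2) = R(1) = 1, K_T(1) = 0, K_T(2) = 3, that the transition probability is evaluated at the next state, and that p(1,2:x) = 3/4 at |x| = 1 exactly (the source leaves the boundary unspecified). The function names are hypothetical.

```python
K_T = {1: 0.0, 2: 3.0}   # terminal cost coefficients of Example 7.2
Q = {1: 1.0, 2: 1.0}     # assumed state weights
R1 = 1.0                 # assumed control weight in form 1


def p_fail(x):
    """Transition probability p(1,2:x); the value at |x| = 1 is an assumption."""
    return 0.25 if abs(x) < 1.0 else 0.75


def one_step_cost(x, u):
    """Expected cost of applying u at x in form 1 with one step to go."""
    x_next = x + u                                   # a(1) = b(1) = 1
    w = p_fail(x_next)
    tail = (1.0 - w) * (K_T[1] + Q[1]) + w * (K_T[2] + Q[2])
    return R1 * u * u + tail * x_next * x_next


def one_step_law(x, step=1e-3):
    """Grid search for the minimizing control; returns (u, cost)."""
    half_width = abs(x) + 2.0
    n = int(round(2.0 * half_width / step))
    best_u, best_cost = 0.0, float("inf")
    for i in range(n + 1):
        u = -half_width + i * step
        c = one_step_cost(x, u)
        if c < best_cost:
            best_u, best_cost = u, c
    return best_u, best_cost


u_mid, c_mid = one_step_law(0.5)      # middlepiece region
u_hedge, c_hedge = one_step_law(-5.0) # hedging-to-a-point region
u_end, c_end = one_step_law(-10.0)    # endpiece region
```

Under these assumptions the grid search recovers, up to grid resolution, the three pieces of u_2 in controller (1): about -(.6364)x near zero, a move to just inside x = -1 from x = -5, and about -(.7647)x at x = -10. A one-step look-ahead controller would apply this same law at every time.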

In the remainder of this section we present and demonstrate a
general p-step (finite) look ahead approximation to the optimal JLQ


problem than the true one.


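The true optimal JLQ controller minimizes

$$V_{N-k}(x_{N-k}, r_{N-k}) = \min_{u_{N-k},\ldots,u_{N-1}} E\Bigl\{ \sum_{\ell=0}^{k-1} \bigl[ x^2_{N-k+\ell+1}\, Q(r_{N-k+\ell+1}) + x_{N-k+\ell+1}\, S(r_{N-k+\ell+1}) + P(r_{N-k+\ell+1}) + u^2_{N-k+\ell}\, R(r_{N-k+\ell}) \bigr] + V_N(x_N, r_N) \Bigr\}$$

for $k \ge 1$ where

$$V_N(x,r) = x^2 K_T(r) + x H_T(r) + G_T(r).$$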

The p-step look-ahead controller that will be derived in Proposition
7.16 is the optimal solution of the control problem:

\[
\min_{u_{N-k},\ldots,u_{N-k+p-1}}
E\Big\{ \sum_{\ell=0}^{p-1} \big[ x_{N-k+\ell+1}^2 Q(r_{N-k+\ell+1})
+ x_{N-k+\ell+1} S(r_{N-k+\ell+1}) + P(r_{N-k+\ell+1})
+ u_{N-k+\ell}^2 R(r_{N-k+\ell}) \big] + V_N(x_{N-k+p}, r_{N-k+p}) \Big\}
\]

for $k \geq 1$. That is, we only consider costs p time steps into the
future and we charge the terminal cost at time N-k+p. At times
(N-p), (N-p+1), ..., (N-1) the controllers are the same for both
problems. At all other times we use the control law
$u_{N-p}(x_{N-k}, r_{N-k})$ instead of the true problem's optimal control
$u_{N-k}(x_{N-k}, r_{N-k})$.

The  performance  of  this  p-step  look  ahead  controller  and  of  the  true 
JLQ  controller  can  be  bounded  by  the  solutions  of  two  other,  different 
control  problems. 


Proposition 7.16 (p-step Look-Ahead Controllers):

1. For any JLQ problem as formulated in (5.1) - (5.6) of section 5.2,
the optimal JLQ controller can be approximated as follows:

for $p \geq 1$ fixed,

• at times (N-p), ..., (N-1) use the true optimal control laws

\[
u_{N-p}(x_{N-p}, r_{N-p}=j), \ldots, u_{N-1}(x_{N-1}, r_{N-1}=j)
\]

for each form $j \in M$; these laws are computed using the
algorithm of section 7.2.

• at all times (N-k) for k > p, use the control law

\[
u_{N-p}(x_{N-k}, r_{N-k}=j)
\]

for each $j \in M$.


For k > p, this control law applied at time (N-k) has a fixed number
$m_{N-p}(j)$ of pieces (for each $j \in M$). These control laws need only be
calculated once. Let us denote the expected cost-to-go from
$(x_{N-k}, r_{N-k})$ that corresponds to this approximate controller by
$\hat{V}_{N-k}(x_{N-k}, r_{N-k})$. It is related to the true optimal JLQ
expected cost-to-go, $V_{N-k}(x_{N-k}, r_{N-k})$, by

\[
V^{L,p}_{N-k}(x_{N-k}, r_{N-k}=j) \leq V_{N-k}(x_{N-k}, r_{N-k}=j)
\leq \hat{V}_{N-k}(x_{N-k}, r_{N-k}=j) \leq V^{U,p}_{N-k}(x_{N-k}, r_{N-k}=j)
\qquad (7.150)
\]

for each $j \in M$,

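where $V^{L,p}_{N-k}(x_{N-k}, r_{N-k}=j)$ is the solution of the problem

\[
V^{L,p}_{N-k}(x_{N-k}, r_{N-k}=j) = \min_{u_{N-k},\ldots,u_{N-k+p-1}}
E\Big\{ \sum_{\ell=0}^{p-1} \big[ x_{N-k+\ell+1}^2 Q(r_{N-k+\ell+1})
+ x_{N-k+\ell+1} S(r_{N-k+\ell+1}) + P(r_{N-k+\ell+1})
+ u_{N-k+\ell}^2 R(r_{N-k+\ell}) \big] \Big\}
\qquad (7.151)
\]

and $V^{U,p}_{N-k}(x_{N-k}, r_{N-k}=j)$ is the solution of the problem

\[
V^{U,p}_{N-k}(x_{N-k}, r_{N-k}=j) = \min_{u_{N-k},\ldots,u_{N-k+p-2}}
E\Big\{ \sum_{\ell=0}^{p-2} \big[ x_{N-k+\ell+1}^2 Q(r_{N-k+\ell+1})
+ x_{N-k+\ell+1} S(r_{N-k+\ell+1}) + P(r_{N-k+\ell+1})
+ u_{N-k+\ell}^2 R(r_{N-k+\ell}) \big]
+ V_T(x_{N-k+p-1}, r_{N-k+p-1}) \Big\}
\qquad (7.152)
\]

where

\[
V_T(x_{N-k+p-1}, r_{N-k+p-1}) =
\frac{a^2(r_{N-k+p-1}) \, R(r_{N-k+p-1}) \, x_{N-k+p-1}^2}{b^2(r_{N-k+p-1})}
+ \sum_{\ell=N-k+p}^{N} E\{ P(r_\ell) \mid x_\ell = 0 \}
+ E\{ G_T(r_N) \mid x_N = 0 \}.
\qquad (7.153)
\]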

3. Consequently the suboptimality incurred by using the approximate
controller instead of the optimal one is bounded as follows:

\[
\hat{V}_{N-k}(x_{N-k}, r_{N-k}) - V_{N-k}(x_{N-k}, r_{N-k})
\leq V^{U,p}_{N-k}(x_{N-k}, r_{N-k}) - V^{L,p}_{N-k}(x_{N-k}, r_{N-k})
\qquad (7.154)
\]

□

Proof: See Appendix C.18.

Note that $V^{L,p}_{N-k}(x_{N-k}, r_{N-k})$ in (7.151) is the optimal
expected cost for a problem where no costs are incurred after time (N-p).
The cost $V^{L,p}_{N-k}(x_{N-k}, r_{N-k})$ can be computed using p steps of
the algorithm of section 7.2, if we set the terminal costs to zero:

\[
K_T(j) = H_T(j) = G_T(j) = 0, \quad \forall j \in M.
\]

This cost will not exceed the optimal expected cost-to-go from
$(x_{N-k}, r_{N-k})$ for the true problem (since in the true problem
$V_N(x_N, r_N) \geq 0$).

The upper bound cost function $V^{U,p}_{N-k}(x_{N-k}, r_{N-k})$ in (7.152)
corresponds to the true problem with the added constraint that x be driven
to zero in p steps, and that x be kept at zero thereafter. This problem's
solution will exceed the optimal expected cost-to-go of the true problem
because of the extra constraint. The first term in (7.153) is the
$u_{N-k+p-1}^2 R(r_{N-k+p-1})$ cost that results from driving $x_{N-k+p}$ to
zero. The second and third terms are costs incurred by keeping x at zero.
Note that they are zero for the problems discussed in chapter 6.

We conclude this section with an example that demonstrates the use of
the section 7.2 algorithm for a more general problem than those of
chapter 6 and that illustrates the application of the preceding
proposition.


Example 7.3:

Consider a system whose form structure is as shown in figure 7.34.
This system might represent the following situation:

   r_k = 1     normal operation
   r_k = 2     degraded operation (repairable failure)
   r_k = 3     nonrepairable failure
   p(1,2:x)    one-step probability of repairable failure
               occurrence (x-dependent)
   p(2,1)      one-step probability of repair
   p(1,3:x)    one-step probability of nonrepairable system
               failure occurrence.

The form transition probabilities from r_k = 1 are piecewise constant in x
(but p(2,1), p(2,2) and p(3,3) are x-independent).

[Figure 7.34: r=1=Normal Operation, r=2=Degraded Operation,
r=3=System Failure (same as Example 5.2).]


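\[
\big( p(1,1:x), \; p(1,2:x), \; p(1,3:x) \big) =
\begin{cases}
(.89, \; .1, \; .01) & \text{if } |x| < 1 \\
(.7, \; .2, \; .1) & \text{if } 1 < |x| < 2 \\
(0, \; .2, \; .8) & \text{if } |x| > 2
\end{cases}
\]

\[
p(2,1) = p(2,2) = .5
\]

Thus the numbers of pieces in each of the form transition probabilities
are

\[
\nu_{11} = \nu_{13} = 5, \qquad \nu_{12} = 3, \qquad
\nu_{21} = \nu_{22} = \nu_{23} = \nu_{31} = \nu_{32} = \nu_{33} = 1.
\]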

The x-dependent transition probabilities are shown in figure 7.35.

The dynamic equations of this system are:

\[
\begin{aligned}
x_{k+1} &= x_k + u_k && \text{in form } r=1 \text{ (normal operation)} \\
x_{k+1} &= x_k + \tfrac{1}{2} u_k && \text{in form } r=2 \text{ (degraded operation)} \\
x_{k+1} &= x_k && \text{in form } r=3 \text{ (system failure).}
\end{aligned}
\]

That is, a(1) = a(2) = a(3) = 1, b(1) = 1, b(2) = 1/2 and b(3) = 0.


The cost parameters are

\[
\begin{aligned}
& Q(1) = Q(2) = 1, \qquad Q(3) = 0 \\
& R(1) = 2, \qquad R(2) = R(3) = 1 \\
& K_T(1) = K_T(2) = K_T(3) = 0 \\
& G_T(3) = 1000 \quad \text{(penalty for system failure).}
\end{aligned}
\]

Note that we are willing to spend more control energy when the system is
in a degraded mode than when it is operating normally.

Clearly we have at all (N-k)

\[
V_{N-k}(x_{N-k}, r_{N-k}=3) = 1000, \qquad u_{N-k}(x_{N-k}, r_{N-k}=3) = 0.
\]

If the nonrepairable failure occurs (that is, if we enter form 3), then
we are charged G_T(3) = 1000 regardless of what we do. The optimal strategy
is to shut the system off (set u=0).

Using the algorithm of section 7.2 we can compute the optimal controllers
in forms r=1 and 2, backwards in time. We find that at time k = (N-1), the
numbers of pieces of the optimal JLQ controllers in each form are

\[
m_{N-1}(1) = 5, \qquad m_{N-1}(2) = 1, \qquad m_{N-1}(3) = 1.
\]

In form $r_{N-1} = 2$,

\[
\begin{aligned}
V_{N-1}(x_{N-1}, r_{N-1}=2) &= .8 \, x_{N-1}^2 \\
u_{N-1}(x_{N-1}, r_{N-1}=2) &= -.4 \, x_{N-1} \\
x_N(x_{N-1}, r_{N-1}=2) &= .8 \, x_{N-1}.
\end{aligned}
\]
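The form-2 result above can be checked by hand. The following is a short worked derivation, not taken from the source; it assumes that the linear and constant cost terms S, P, H_T and the terminal costs G_T(1), G_T(2) are all zero in this example (an assumption that is consistent with the values quoted above). From form 2 the next form is 1 or 2 with probability .5 each, both with Q = 1, and x_N = x_{N-1} + u_{N-1}/2, so

\[
\begin{aligned}
V_{N-1}(x, r_{N-1}=2) &= \min_u \Big\{ R(2) u^2 + \big[ p(2,1) Q(1) + p(2,2) Q(2) \big] \big( x + \tfrac{1}{2} u \big)^2 \Big\}
 = \min_u \Big\{ u^2 + \big( x + \tfrac{1}{2} u \big)^2 \Big\}, \\
0 &= 2u + \big( x + \tfrac{1}{2} u \big) \;\Rightarrow\; u = -.4 \, x, \qquad x_N = x + \tfrac{1}{2} u = .8 \, x, \\
V_{N-1}(x, r_{N-1}=2) &= (.4 x)^2 + (.8 x)^2 = .8 \, x^2 .
\end{aligned}
\]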

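The controller in form r_{N-1}=1 is summarized by table 7.8 below.

for x_{N-1}                    V_{N-1}(x_{N-1},r_{N-1}=1)          u_{N-1}(x_{N-1},r_{N-1}=1)   x_N(x_{N-1},r_{N-1}=1)

x_{N-1} < -21.934              (.18182) x_{N-1}^2 + 800            -(.09091) x_{N-1}            (.90909) x_{N-1}
-21.934 < x_{N-1} < -1.495     2 x_{N-1}^2 + 4 x_{N-1} + 12.99     -x_{N-1} - 1^+               -1^+
-1.495 < x_{N-1} < 1.495       (.6622) x_{N-1}^2 + 10              -(.3311) x_{N-1}             (.6689) x_{N-1}
1.495 < x_{N-1} < 21.934       2 x_{N-1}^2 - 4 x_{N-1} + 12.99     -x_{N-1} + 1^-               1^-
21.934 < x_{N-1}               (.18182) x_{N-1}^2 + 800            -(.09091) x_{N-1}            (.90909) x_{N-1}

Table 7.8: Optimal Controller at time k = N-1 in form r_{N-1}=1,
for example 7.3.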
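The rows of table 7.8 can be reproduced numerically by solving one constrained quadratic subproblem per transition-probability piece and keeping the cheapest. The sketch below is illustrative only and is not the algorithm of section 7.2: the function name `last_stage_form1` is hypothetical, the terminal costs other than G_T(3) are assumed to be zero, and each piece is treated as a closed interval, so that the hedging points 1^- and -1^+ appear as exactly 1.0 and -1.0.

```python
import math

# Pieces of the x_N axis on which the form-1 transition probabilities
# are constant: (lower, upper, p(1,1), p(1,2), p(1,3)).
PIECES = [
    (-math.inf, -2.0, 0.0, 0.2, 0.8),
    (-2.0, -1.0, 0.7, 0.2, 0.1),
    (-1.0, 1.0, 0.89, 0.1, 0.01),
    (1.0, 2.0, 0.7, 0.2, 0.1),
    (2.0, math.inf, 0.0, 0.2, 0.8),
]
Q = (1.0, 1.0, 0.0)   # Q(1), Q(2), Q(3)
R1 = 2.0              # R(1)
GT3 = 1000.0          # G_T(3); other terminal costs assumed zero


def last_stage_form1(x):
    """Return (cost, u, x_next) at time N-1 in form 1 (a(1) = b(1) = 1)."""
    best = None
    for lo, hi, p11, p12, p13 in PIECES:
        q = p11 * Q[0] + p12 * Q[1] + p13 * Q[2]   # weight on x_N^2
        c = p13 * GT3                               # expected failure penalty
        # Unconstrained minimizer of R1*u^2 + q*(x+u)^2 in terms of x_N,
        # clamped to the piece (the objective is convex in x_N).
        x_next = min(max(R1 * x / (R1 + q), lo), hi)
        u = x_next - x
        cost = R1 * u * u + q * x_next * x_next + c
        if best is None or cost < best[0]:
            best = (cost, u, x_next)
    return best
```

For instance, `last_stage_form1(10.0)` selects the hedging row (x_N clamped to 1.0), while `last_stage_form1(30.0)` selects the endpiece row with x_N = (.90909) x_{N-1}.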

In form r_{N-1}=1 the optimal controller actively hedges-to-a-point for
certain x_{N-1} values. Specifically, from (x_{N-1}, r_{N-1}=1) the optimal
controller

• drives |x_N| > 2 for |x_{N-1}| > 21.934

• hedges-to-a-point, obtaining

     x_N = -1^+   for -21.934 < x_{N-1} < -1.495
     x_N = 1^-    for 1.495 < x_{N-1} < 21.934

• drives |x_N| < 1 for |x_{N-1}| < 1.495.

The optimal controller from (x_{N-1}, r_{N-1}=1) avoids the x_N values in
the intervals (-19.93999, -1) and (1, 19.93999).

Note that the controller completely avoids the intervals (-2,-1) and
(1,2) where the one-step probabilities of normal operation and
nonrepairable failure take intermediate values (see figure 7.35). Either
the controller resigns itself to failure of some kind (for
|x_{N-1}| > 21.934) or it forces the system into the region of x_N values
where the probabilities of both nonrepairable and repairable failures are
lowest.

If the repairable actuator failure has already occurred (that is,
r_{N-1}=2), then the system is equally likely to be in forms 1 or 2 at the
next time regardless of the control that is applied. Thus the single-piece
structure of u_{N-1}(x_{N-1}, r_{N-1}=2) results. Another way of viewing
this is as follows: when r_{N-1}=2, the conditional cost
V_N(x_N | r_{N-1}=2) has no discontinuities and therefore there is no
hedging-to-a-point from (x_{N-1}, r_{N-1}=2). Since V_N(x_N | r_{N-1}=2) has
only one piece, the same control law is used for all x_{N-1} when
r_{N-1}=2. Thus there is no region of avoided x_N values from
(x_{N-1}, r_{N-1}=2).

Using the algorithm of section 7.2 again, we find that at time
k = (N-2), the numbers of pieces of the optimal controller are

\[
m_{N-2}(1) = 5, \qquad m_{N-2}(2) = 5, \qquad m_{N-2}(3) = 1.
\]

In form r_{N-2}=1 we have the controller that is summarized in table 7.9.

for x_{N-2}                    V_{N-2}(x_{N-2},r_{N-2}=1)          u_{N-2}(x_{N-2},r_{N-2}=1)   x_{N-1}(x_{N-2},r_{N-2}=1)

x_{N-2} < -22.630              (.30508) x_{N-2}^2 + 800            -(.15254) x_{N-2}            (.84746) x_{N-2}
-22.630 < x_{N-2} < -1.8297    2 x_{N-2}^2 + 4 x_{N-2} + 22.559    -x_{N-2} - 1^+               -1^+
-1.8297 < x_{N-2} < 1.8297     (.90691) x_{N-2}^2 + 18.9           -(.45346) x_{N-2}            (.54654) x_{N-2}
1.8297 < x_{N-2} < 22.630      2 x_{N-2}^2 - 4 x_{N-2} + 22.559    -x_{N-2} + 1^-               1^-
22.630 < x_{N-2}               (.30508) x_{N-2}^2 + 800            -(.15254) x_{N-2}            (.84746) x_{N-2}

Table 7.9: Optimal Controller at time k = N-2 in form r_{N-2}=1,
for example 7.3


In form r_{N-2}=1 the optimal controller does the following:

• keeps |x_{N-1}| > 19.1803 if |x_{N-2}| > 22.630

• hedges-to-a-point, obtaining

     x_{N-1} = -1^+   for -22.630 < x_{N-2} < -1.8297
     x_{N-1} = 1^-    for 1.8297 < x_{N-2} < 22.630

The optimal controller from (x_{N-2}, r_{N-2}=1) avoids (-19.1803,-1) and
(1,19.1803). Thus, as at time k = N-1, the optimal controller from
r_{N-2}=1 completely avoids the intermediate-level failure risk regions
(-2,-1) and (1,2).

In table 7.10 the optimal controller from r_{N-2}=2 is summarized.

Here the optimal controller does not hedge-to-a-point with u_{N-2},
since V_{N-1}(x_{N-1} | r_{N-2}=2) has no discontinuities. However, there
are intervals of avoided x_{N-1} values:

     (-23.607123, -20.41)     (20.41, 23.607123)

for x_{N-2}                    V_{N-2}(x_{N-2},r_{N-2}=2)                  u_{N-2}(x_{N-2},r_{N-2}=2)   x_{N-1}(x_{N-2},r_{N-2}=2)

x_{N-2} < -32.406              (1.0861) x_{N-2}^2 + 400                    -(.54305) x_{N-2}            (.72848) x_{N-2}
-32.406 < x_{N-2} < -3.4428    (1.5) x_{N-2}^2 + (1.25) x_{N-2} + 5.8451   -(.75) x_{N-2} - .3125       (.625) x_{N-2} - .15625
-3.4428 < x_{N-2} < 3.4428     (1.2082) x_{N-2}^2 + 5                      -(.60411) x_{N-2}            (.697945) x_{N-2}
3.4428 < x_{N-2} < 32.406      (1.5) x_{N-2}^2 - (1.25) x_{N-2} + 5.8451   -(.75) x_{N-2} + .3125       (.625) x_{N-2} + .15625
32.406 < x_{N-2}               (1.0861) x_{N-2}^2 + 400                    -(.54305) x_{N-2}            (.72848) x_{N-2}

Table 7.10: Optimal controller at time k = N-2 in
form r_{N-2} = 2, for example 7.3


Using the algorithm of section 7.2 once again, we find that at
time k = (N-3):

\[
m_{N-3}(1) = 7, \qquad m_{N-3}(2) = 9, \qquad m_{N-3}(3) = 1.
\]

In form r_{N-3}=1, the optimal controller is given in table 7.11.

for x_{N-3}                    V_{N-3}(x_{N-3},r_{N-3}=1)                u_{N-3}(x_{N-3},r_{N-3}=1)   x_{N-2}(x_{N-3},r_{N-3}=1)

x_{N-3} < -39.800              (.34521) x_{N-3}^2 + 880                  -(.17260) x_{N-3}            (.8274) x_{N-3}
-39.800 < x_{N-3} < -23.156    (.4) x_{N-3}^2 + (.2) x_{N-3} + 801.16    -.2 x_{N-3} - .05            .8 x_{N-3} - .05
-23.156 < x_{N-3} < -1.9590    2 x_{N-3}^2 + 4 x_{N-3} + 31.2390         -x_{N-3} - 1^+               -1^+
-1.9590 < x_{N-3} < 1.9590     (.97906) x_{N-3}^2 + 27.321               -(.48953) x_{N-3}            (.51047) x_{N-3}
1.9590 < x_{N-3} < 23.156      2 x_{N-3}^2 - 4 x_{N-3} + 31.2390         -x_{N-3} + 1^-               1^-
23.156 < x_{N-3} < 39.800      (.4) x_{N-3}^2 - (.2) x_{N-3} + 801.16    -.2 x_{N-3} + .05            .8 x_{N-3} + .05
39.800 < x_{N-3}               (.34521) x_{N-3}^2 + 880                  -(.17260) x_{N-3}            (.8274) x_{N-3}

Table 7.11: Optimal Controller at time k = (N-3) in
form r_{N-3} = 1, for example 7.3

In form r_{N-3}=1 the optimal controller hedges-to-a-point for
1.9590 < |x_{N-3}| < 23.156. The optimal controller avoids the intervals
of x_{N-2} values:

     (-32.93052, -31.89)     (31.89, 32.93052)
     (-18.5748, -1)          (1, 18.5748)

Once again, the optimal controller in form 1 avoids the regions of x
values that correspond to an intermediate level of failure risk.

The optimal controller from r_{N-3}=2 is given in table 7.12. As
before, the optimal controller from (x_{N-3}, r_{N-3}=2) does not
hedge-to-a-point. The regions of x_{N-2} values that are avoided by the
optimal JLQ controller from (x_{N-3}, r_{N-3}=2) are

     (-32.962, -31.8608)      (31.8608, 32.962)
     (-24.1187, -21.2383)     (21.2383, 24.1187)

We will compare the performance of the optimal controller above
to that of a p = 1 step look-ahead suboptimal controller. As specified
by Proposition 7.16, the p=1 step look-ahead controller uses
the optimal controller of time k=(N-1) at all times. In table 7.13
we compare the expected costs-to-go achieved by the optimal controller
and the p=1 suboptimal controller for an N=3 time step problem.




                                           x_0 = 1     x_0 = 5     x_0 = 20    x_0 = 40

Optimal expected cost from r_0=1           28.3001     61.239      751.239     1432.34
p=1 suboptimal expected cost from r_0=1    28.4229     61.2765     751.2765    1465.57
Optimal expected cost from r_0=2           13.3087     47.0015     634.666     2448.85
p=1 suboptimal expected cost from r_0=2    13.4635     52.8952     740.795     2611.25
Suboptimality of p=1 controller            .1228       .0375       .0375       33.33
  from r_0=1                               (0.43%)     (0.06%)     (0.005%)    (2.33%)
Suboptimality of p=1 controller            .1548       5.8937      106.129     162.4
  from r_0=2                               (1.157%)    (12.54%)    (16.72%)    (6.63%)

Table 7.13: Expected Costs Obtained by the Optimal and p=1 Look-ahead
Controllers.
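The p=1 entries of table 7.13 are expected costs of a fixed policy, so they can be evaluated by a direct recursion over the form transitions. The sketch below is an illustration under stated assumptions, not the procedure used in the thesis: the names `p1_law` and `expected_cost` are hypothetical, p(3,3) is taken to be 1 (form 3 is nonrepairable), all terminal costs other than G_T(3) are assumed to be zero, and hedging points are treated as lying inside the lower-risk piece.

```python
import math

# (lower, upper, p(1,1), p(1,2), p(1,3)) on each piece of the next state.
PIECES = [
    (-math.inf, -2.0, 0.0, 0.2, 0.8),
    (-2.0, -1.0, 0.7, 0.2, 0.1),
    (-1.0, 1.0, 0.89, 0.1, 0.01),
    (1.0, 2.0, 0.7, 0.2, 0.1),
    (2.0, math.inf, 0.0, 0.2, 0.8),
]
Q = (1.0, 1.0, 0.0)        # Q(1), Q(2), Q(3)
R = (2.0, 1.0, 1.0)        # R(1), R(2), R(3)
GT = (0.0, 0.0, 1000.0)    # G_T(j); G_T(1) = G_T(2) = 0 is an assumption


def p1_law(x, form):
    """Time-(N-1) law applied at every time: (u, x_next, next-form probs)."""
    if form == 3:
        return 0.0, x, (0.0, 0.0, 1.0)       # b(3) = 0, assumed absorbing
    if form == 2:
        return -0.4 * x, 0.8 * x, (0.5, 0.5, 0.0)   # x_next = x + u/2
    best = None
    for lo, hi, *probs in PIECES:
        q = sum(p * qi for p, qi in zip(probs, Q))
        x_next = min(max(R[0] * x / (R[0] + q), lo), hi)
        u = x_next - x
        cost = R[0] * u * u + q * x_next ** 2 + probs[2] * GT[2]
        if best is None or cost < best[0]:
            best = (cost, u, x_next, tuple(probs))
    return best[1:]


def expected_cost(x, form, steps):
    """Expected cost of applying p1_law for `steps` stages, then G_T."""
    if steps == 0:
        return GT[form - 1]
    u, x_next, probs = p1_law(x, form)
    total = R[form - 1] * u * u
    for i, p in enumerate(probs):
        if p > 0.0:
            total += p * (Q[i] * x_next ** 2
                          + expected_cost(x_next, i + 1, steps - 1))
    return total
```

With these assumptions, `expected_cost(1.0, 1, 3)` and `expected_cost(1.0, 2, 3)` come out close to the x_0 = 1 entries for the p=1 controller in the table. Agreement for the other initial states is not claimed here, since it depends on how the discontinuity points are handled.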

The suboptimal controller performs well at each x_0 value when
the system begins in r_0 = 1. This is because in form 1 the p=1
suboptimal controller can hedge-to-a-point. In form 2 the p=1 controller
has a single control law; thus the suboptimal controller cannot
hedge-to-a-point until the system is repaired (that is, until it leaves
form r=2). Despite this fact the p=1 controller performs well for
r_0 = 2.

In this section we have presented and illustrated via examples
certain suboptimal approximations to the optimal JLQ controller. We
first developed an approximate controller for the single-form transition
control problems that were described in Propositions 7.7, 7.8, 7.12
and 7.13. Then a p-step look-ahead controller was described in
Proposition 7.16. This p-step suboptimal controller is applicable
to the general class of JLQ problems of chapter 5.


7.8 Summary of Part III

In part III (chapters 5, 6 and 7) we have considered scalar JLQ control
problems that involve state-dependent structural changes. This class
of nonlinear stochastic control problems yields controller designs which
endow systems with fault-tolerance, in that the controller takes into
account known system limitations and failure likelihoods so as to achieve
the best tradeoff between system reliability and performance goals. The
optimal controller attempts to minimize the cost incurred by the usual LQ
regulator action, and by driving the system state to regions where the
likelihoods of undesirable form shifts are reduced. We have formulated
and solved a class of scalar-in-x, noiseless JLQ problems with x-dynamics
that would be linear, if not for random x-dependent jumping parameters.

These problems possess form transition probabilities that depend upon x
in a piecewise-constant way. For this class of problems we have developed
a procedure that calculates the optimal expected costs-to-go and
control laws "off-line", in advance of system operation. The procedure
determines the optimal controller inductively, backwards in time (for
finite time-horizon problems).

The basic idea of the solution procedure is simple, and the solution
structure is conceptually straightforward. However, the notation
that is required to describe the solution becomes quite complex.
Essentially, the nonlinearity of the system dynamics (due to the
x-dependence of the form transition probabilities) is converted into
computational complexity in the determination of the controller. At each
time the optimal controller is obtained by calculating and comparing a
growing number of quadratic functions. These quadratic functions are
computed via Riccati-like difference equations. It is the
piecewise-constant structure of the form transition probabilities that
allows us to do this.

In chapter 5 the general problem under consideration was formulated and
one stage of a simple problem was solved from first principles. Guided by
intuition gained from this example, a general one-stage solution procedure
was developed in section 5.4. We established that the optimal control laws
are piecewise-linear in x (with x^1, x^0 terms) and the optimal expected
costs-to-go are piecewise-quadratic in x (with x^2, x^1, x^0 terms). The
different controller pieces arise from using the control to actively
hedge. Intuitively, at each stage the optimal controller must take into
account what the expected cost of driving x into different regions will
be, where different values of the form transition probabilities apply. As
the control problem is solved backwards in time (using dynamic
programming), the controller must take into account what the effects of
active hedging will be at the intervening times. The number of pieces of
the controller grows additively. This additive increase depends upon the
number of different forms into which the system can change (from its
current one) and the number of pieces in the relevant
piecewise-constant-in-x transition probabilities. Thus there is a
tradeoff between the accuracy of the modeling of failure probability
state-dependence (via piecewise-constant approximations) and the
computational burden of control law determination and the complexity of
the controller. In chapter 5 we also identified several basic qualitative
properties of the optimal JLQ controller. These included
hedging-to-a-point, regions of avoidance and the endpieces and
middlepieces of the expected costs-to-go and control laws.

In chapter 6 we investigated these properties in detail. In particular
we examined the behavior of the optimal control laws and expected
costs-to-go when x is far from zero ("endpieces")^1 and when x is near
zero ("middlepieces")^2. Over these regions of x values,
V_k(x_k, r_k=j) can be computed from sets of recursive difference
equations.
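For the data of Example 7.3 such a recursion can be written out explicitly. The sketch below is a minimal illustration under stated assumptions: purely quadratic costs (no x^1 terms, and no x^0 terms apart from the terminal penalty G_T(3)), form transition probabilities frozen at their middlepiece or endpiece values, and p(3,3) = 1. The helper names `riccati_step` and `pieces` are hypothetical. Each step minimizes R(j) u^2 + qbar (a(j) x + b(j) u)^2 over u in closed form.

```python
def riccati_step(K, G, P, a, b, Q, R):
    """One backward step of a scalar x-independent JLQ recursion.

    The cost-to-go is K[j] * x**2 + G[j] in form j, and P[j][i] is the
    fixed probability of moving from form j to form i.
    """
    M = len(K)
    K_new, G_new = [], []
    for j in range(M):
        # Expected quadratic weight on the next state.
        qbar = sum(P[j][i] * (Q[i] + K[i]) for i in range(M))
        K_new.append(R[j] * a[j] ** 2 * qbar / (R[j] + b[j] ** 2 * qbar))
        G_new.append(sum(P[j][i] * G[i] for i in range(M)))
    return K_new, G_new


# Example 7.3 data; forms 1, 2, 3 are stored at indices 0, 1, 2.
a, b = [1.0, 1.0, 1.0], [1.0, 0.5, 0.0]
Q, R = [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]
P_MID = [[0.89, 0.1, 0.01], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]   # |x| < 1
P_END = [[0.0, 0.2, 0.8], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]     # |x| > 2


def pieces(P, steps):
    """Run `steps` backward steps from the terminal costs."""
    K, G = [0.0, 0.0, 0.0], [0.0, 0.0, 1000.0]
    for _ in range(steps):
        K, G = riccati_step(K, G, P, a, b, Q, R)
    return K, G
```

Three steps with `P_MID` give a form-1 cost of roughly (.97906) x^2 + 27.321, and three steps with `P_END` give roughly (.34521) x^2 + 880, which are the middlepiece and endpiece rows of table 7.11.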

The equations specifying these endpieces and middlepieces of the optimal
controller are the same as those that solve certain corresponding
x-independent JLQ problems (as in chapter 3). Upper and lower bounds on
the expected optimal cost-to-go when x is between these endpiece and
middlepiece domains were also obtained^2. In chapter 7 we used the
combinatoric properties established in chapter 5 and the results of
chapter 6 to construct an algorithm for the efficient computation of the
optimal controller. This algorithm was presented in flowchart form and
described in detail.

The basic idea is to compute the optimal cost function V_k(x_k, r_k=j) at
time stage k (and in each form j) one piece at a time, starting on the
left (with the left endpiece). Using Propositions 5.2 and 5.3, the number
of calculations and comparisons that this solution algorithm must make is
greatly reduced from those of the "brute force" solution technique in
chapter 5. This solution algorithm (developed in section 7.2) is
applicable to all problems satisfying the requirements of Proposition 5.1.
The class of JLQ problems addressed by Proposition 5.1 is extremely rich.
The resulting optimal controllers can exhibit a wide variety of
qualitative behaviors. Analytical characterizations of these JLQ
controllers that are sufficiently general to encompass the entire problem
class tend to be uninformative, since so many diverse behaviors must be
simultaneously considered. We chose, therefore, to focus our attention on
problems that lend insight into the kinds of qualitative JLQ controller
behaviors that are appropriate in fault-tolerant control applications.
Our vehicle for doing this was the single form-transition problem that was
developed in sections 6.5 and 6.6 and specialized to two archetypical
problems in sections 7.3-7.6. In one of these classes the two goals of
high performance and high reliability are commensurate. In the other
class they are at cross purposes. We examined the parametric dependence
of the hedging regions, regions of avoidance, stability properties, and
local minima in the expected costs-to-go for these controllers. Under
certain assumptions the algorithm of section 7.2 reduces to the solution
of (increasingly many) sets of difference equations as N-k increases. This
makes these problems amenable to further detailed analysis, and it lets us
illustrate some of the controller properties and qualitative issues that
arise from the use of control to achieve both reliability and performance
goals.

^1 For all problems of chapter 5.

^2 For problems with purely quadratic costs (no x^1 and x^0 terms).

For the general problem of Part III, as the time horizon of the problem
becomes infinite the number of pieces in the optimal controller becomes
infinite. That is, the optimal infinite time-horizon controller cannot
be obtained by any finite algorithm. For the two problem classes of
sections 7.5 and 7.6 we could analyze the infinite time horizon behavior
of the controller and obtain the optimal steady-state controllers as
(N-k) → ∞, since the optimal controller at each time can be obtained from
the solution of increasingly many difference equations without making the
comparisons and tests in the solution algorithm (of section 7.2) that are
needed in general.

The steady-state solutions that are obtained for these two problem
classes exhibit a structure that suggests a "natural" approximation to the
steady-state optimal controller (both for these problems and the general
class of problems in Chapter 5). These approximations correspond to
"finite look-ahead" controllers which ignore eventualities that might occur
beyond some fixed planning time. By ignoring the far future, optimality
is lost in these controllers but the computational burden of determining
them and the complexity (and cost) of their implementation are reduced.

This finite look-ahead controller was developed in section 7.7.

In conclusion, in this part of the thesis we have formulated and
solved a class of nonlinear discrete-time stochastic control problems. The
optimal controller is obtained recursively, backwards in time, by an
algorithm which was presented in flowchart form. Less complex but
suboptimal approximations of this optimal controller were also presented.
For special classes of these problems, the optimal controller algorithm
collapses to a set of recursive difference equations. These special
problems were examined in detail. In the next part of this thesis we will
extend the results of Part III to address more general problems than those
of chapter 5.


PART IV


EXTENSIONS TO THE SCALAR
X-DEPENDENT NOISELESS JLQ PROBLEM


THE JUMP LINEAR PIECEWISE-QUADRATIC CONTROL PROBLEM


8.1 Introduction

In this part of the thesis we extend the range of control
problems for which the methods of Parts II and III are applicable.
In this chapter we modify the scalar JLQ problem of chapters 5-7
to include a more general class of x-operating and terminal costs.

Specifically, we consider x operating costs Q(x_k, r_k) and terminal
costs Q_T(x_N, r_N) that are piecewise-quadratic in x (with x^2, x^1
and x^0 terms); these nonnegative costs may have constant pieces,
linear pieces and quadratic pieces that are concave-up or concave-down.

We  call  this  the  jump  linear  piecewise  quadratic  (JLPQ)  control 
problem.  Our  study  of  this  class  of  problems  is  motivated  by  two 
factors : 


.  The  solution  of  the  JLPQ  control  problem  is  a 
necessary  step  in  the  extension  of  the  JLQ 
solution  to  systems  having  additive  input  noise 
and  more  general  x-dependent  form  transition 
probabilities;  we  will  use  the  results  of  this 
chapter  in  Chapter  9. 

.  The  JLPQ  formulation  broadens  the  range  of 
problems  that  can  be  addressed  by  the 


In  particular,  the  JLPQ  formulation  includes  x-operating  costs 
that  are 


.  constant  in  x 

eg:  Q(x,j)  =  100 

.  piecewise-constant in x with discontinuities

\[
\text{eg:} \quad Q(x,j) =
\begin{cases}
100 & |x| > 10 \\
0 & |x| < 10
\end{cases}
\]


.  piecewise-quadratic  in  x  with  concave-down 


pieces 

eg:  Q (x, j ) 


1*1  >  *5 
- . 5<x<0 

0<x<.  5 


Example  problems  with  these  kinds  of  x  costs  will  be  examined 
in  this  chapter. 
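One convenient way to hold such costs in a program is as a sorted list of breakpoints with one (x^2, x^1, x^0) coefficient triple per interval. The sketch below is only an illustration: the class name `PiecewiseQuadratic` is hypothetical, and assigning a breakpoint itself to the piece on its right is an assumption, since the examples above use strict inequalities and leave the boundary values unspecified.

```python
from bisect import bisect_right


class PiecewiseQuadratic:
    """Piecewise-quadratic function of a scalar x.

    breakpoints: increasing list of interior breakpoints.
    coeffs: one (q, s, p) triple per interval, giving q*x**2 + s*x + p;
            there is one more interval than there are breakpoints.
    """

    def __init__(self, breakpoints, coeffs):
        if len(coeffs) != len(breakpoints) + 1:
            raise ValueError("need exactly one more piece than breakpoints")
        if list(breakpoints) != sorted(breakpoints):
            raise ValueError("breakpoints must be increasing")
        self.breakpoints = list(breakpoints)
        self.coeffs = list(coeffs)

    def __call__(self, x):
        # bisect_right assigns a breakpoint to the piece on its right.
        q, s, p = self.coeffs[bisect_right(self.breakpoints, x)]
        return q * x * x + s * x + p


# The first two example costs listed above.
constant_cost = PiecewiseQuadratic([], [(0.0, 0.0, 100.0)])
deadband_cost = PiecewiseQuadratic(
    [-10.0, 10.0],
    [(0.0, 0.0, 100.0), (0.0, 0.0, 0.0), (0.0, 0.0, 100.0)],
)
```

Here `deadband_cost(0.0)` evaluates to 0 and `deadband_cost(11.0)` to 100, matching the piecewise-constant example.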

The basic structure of the optimal controllers for the
JLPQ problem is similar to that for the JLQ problems. The
optimal expected costs-to-go are piecewise-quadratic and the
control laws are piecewise-linear in x_k, in each form. The
derivation of the JLPQ solution uses the same idea that was used
in Chapter 5:


We break up the JLPQ problem into constrained
subproblems that are easier to solve, and then
we compare these subproblem solutions to determine
the optimal controller.


At each time step k, the control problem involving the search
for V_k(x_k, r_k=j) is transformed into the comparison of many
constrained-in-x_{k+1} JLQ control problems with x-independent form
transitions and quadratic x-costs. One such constrained problem arises
over each interval of x_{k+1} values having

.  constant form transition probabilities

   p(j,i:x_{k+1}) (for all i ∈ C_j)

.  a quadratic expected cost-to-go (with x_{k+1}^2, x_{k+1}^1 and
   x_{k+1}^0 terms)

   V_{k+1}(x_{k+1}, r_{k+1}=i) for all i ∈ C_j

.  a quadratic x-cost (with x_{k+1}^2, x_{k+1}^1 and x_{k+1}^0
   terms), Q(x_{k+1}, r_{k+1}=i) for all i ∈ C_j.

The number of costs-to-go that must be compared at each stage,
and the number of pieces, m_{k+1}(j), in the optimal expected cost-to-go
V_{k+1}(x_{k+1}, r_{k+1}=j) may grow at a faster than linear rate with the
number of form transition probability pieces and x-cost pieces
(unlike the JLQ problem of Chapter 5). The "piecewise" structure
of the optimal expected costs-to-go and control laws for the JLPQ
problems of this chapter is caused by both

.  the piecewise-constant nature of the form
   transition probabilities (as in Chapters 5-7)

and

.  the piecewise-quadratic nature of the
   x-costs.

We develop in this chapter a recursive procedure for the
determination of the optimal expected costs-to-go and control laws
when the system is in each form. This procedure can be done off-line,
in advance of system operation. It is carried out by
pursuing a sequence of computations and comparisons that are described
in a flowchart. This solution procedure is a generalization
of the algorithm of section 7.2. Certain modifications are necessitated
by the qualitative controller properties which result from
the piecewise nature of the x-costs Q(x_{k+1}, r_{k+1}) and Q_T(x_N, r_N).

Although the basic idea of this chapter is simple, the
derivation and presentation of the general result involves
unavoidably complicated notation and "bookkeeping" problems. For this
reason this chapter has been organized as follows:

1.  In Section 8.2 the general JLPQ problem is formulated.

2.  In Section 8.3 we solve for the last-stage controller
for four JLPQ control problem examples, and we
compare these results.

3.  Guided by the intuition gained from these examples,
a general one-step solution procedure is developed
in Section 8.4. We state and prove Proposition 8.1
which is a generalization (to JLPQ problems) of
Proposition 5.1.

4.  In Section 8.5 we establish a number of qualitative
properties of the JLPQ controller (essentially
generalizations of the results of Chapters 5 and 6).
These results are then used to develop the solution
algorithm, which is presented in the flowcharts of
figures 8.5-8.12.

5.  In Section 8.6 we illustrate the use of this algorithm
by solving for the k=(N-2) controller for the problem
of example 8.2.

From the study of the optimal JLPQ controllers developed here
we can gain additional insight into the structures of controllers
that use active hedging, and into the qualitative effects of their
control actions. The algorithm of Section 8.5 provides the basis
for the approximate solution of the scalar JLPC (jump-linear-
piecewise-convex) control problems of the next chapter. These
problems have x-costs, form transition probabilities and additive
input noise densities that are piecewise-convex or concave in x. They
can be solved approximately using the algorithm of section 8.5 and
certain approximations motivated by the qualitative results of
this chapter.


The grid points (s) may be different for each pair $(i,j) \in M \times M$.
For all s = 1, 2, ...,

$$\lambda_{ij}(s) \ge 0 \quad \text{for each } i, j \in M$$

$$\sum_{j=1}^{M} \lambda_{ij}(s) = 1 \quad \text{for each } i \in M.$$


We assume (as in Part III) that the state $(x_k, r_k)$ is perfectly
observed at each k. The problem is to find the optimal control laws

$$u_k = \phi_k(x_0, \ldots, x_k;\ r_0, \ldots, r_k)$$

that minimize the cost criterion

$$J_{k_0}(x_{k_0}, r_{k_0}) = E\left\{ \sum_{k=k_0}^{N-1} \left[ u_k^2 R(r_k) + Q(x_{k+1}, r_{k+1}) \right] + Q_T(x_N, r_N) \right\} \tag{8.5}$$

where the expectation is over $\{r_k\}$.

As in the JLQ problems of Chapters 5-7, we assume that the penalty on
the control magnitude is quadratic. It is assumed that

$$R(j) > 0 \quad \text{for each } j \in M. \tag{8.6}$$


The x-operating costs $Q(x,j)$ and terminal costs $Q_T(x,j)$ make the
above problem formulation more general than in Part III. We assume here
that for each $j \in M$, $Q(x,j)$ and $Q_T(x,j)$ are piecewise quadratic
functions having $\mu^j$ and $\eta^j$ pieces, respectively:

$$Q(x,j) = Q^j(t)\,x^2 + S^j(t)\,x + P^j(t) \quad \text{if } \mu^j(t-1) < x < \mu^j(t), \qquad t = 1, \ldots, \mu^j \tag{8.7}$$

$$Q_T(x_N,j) = K_T^j(t)\,x_N^2 + H_T^j(t)\,x_N + G_T^j(t) \quad \text{if } \eta^j(t-1) < x_N < \eta^j(t), \qquad t = 1, \ldots, \eta^j \tag{8.8}$$

We also assume that

$$Q(x,j) \ge 0, \qquad Q_T(x,j) \ge 0 \quad \text{for all } x. \tag{8.9}$$

Note that (8.9) requires that the endpiece x-costs have

$$K_T^j(1),\ K_T^j(\eta^j),\ Q^j(1),\ Q^j(\mu^j) \ge 0$$

for all $j \in M$.


The term $Q_T(x_N, r_N)$ in (8.5) is a terminal cost charged in addition
to the time-invariant x-operating cost $Q(x_N, r_N)$. Since
$\{(x_k, r_k) : k = k_0, \ldots, N\}$ is a Markov process we need only
consider feedback laws of the type

$$u_k = \phi_k(x_k, r_k).$$

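Defining the expected cost-to-go $V_k(x_k, r_k)$ as before and applying
dynamic programming from the finite terminal time k=N, we have the
relationship

$$V_k(x_k, r_k) = \min_{u_k} E\left\{ u_k^2 R(r_k) + Q(x_{k+1}, r_{k+1}) + V_{k+1}(x_{k+1}, r_{k+1}) \,\middle|\, x_k, r_k \right\} \tag{8.10}$$

for $k = N-1, N-2, \ldots, k_0$, where

$$V_N(x_N, r_N = j) = Q_T(x_N, j)$$

with

$$m_N(j) = \eta^j$$

and

$$\delta_N^j(t) = \eta^j(t), \qquad t = 1, \ldots, \eta^j - 1,$$

from which we can (in principle) solve for the optimal controls.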


In the next section we will solve the last-stage control problem
(k=N-1) for several example problems that satisfy (8.1)-(8.10).


In this section we examine several example problems that satisfy
(8.1)-(8.10). We solve for the last-stage optimal controller (i.e.,
at time k=N-1) for these examples from first principles. These
controllers are then analyzed and compared. They will provide insight
regarding the solution of the general JLPQ problem later in this
chapter. All of the examples of this section are variations of the
following control problem: consider a system with M=2 forms where


$$p(1,2:x) = \begin{cases} 1/4 & |x| < 1 \\ 3/4 & |x| > 1 \end{cases} \tag{8.12}$$

$$Q_T(x_N, r_N = 2) = 1000 \tag{8.14}$$

and $Q(x_{k+1}, r_{k+1}=1)$ is piecewise-quadratic in $x_{k+1}$ and
satisfies $Q(x_{k+1}, r_{k+1}=1) \ge 0$ at each $x_{k+1}$. In this
problem, if the system fails (jumps into form r=2), it stays there.
There is no repair possible.

In form r=2 the value of x does not change; a terminal penalty of
$Q_T(x_N, r_N=2) = 1000$ is incurred. Clearly in form r=2,

$$V_k(x_k, r_k = 2) = 1000, \qquad u_k(x_k, r_k = 2) = 0 \tag{8.15}$$

at all times. That is, if the system fails then the optimal strategy
is to turn it off (by setting u=0).

In this section we will consider four different piecewise-quadratic
x-costs, $Q(x_{k+1}, r_{k+1}=1)$. Recall that the conditional expected
cost-to-go $\hat V_N(x_N \mid r_{N-1}=1)$ is defined by

$$\hat V_N(x_N \mid r_{N-1}=1) = E\{ Q(x_N, r_N) + V_N(x_N, r_N) \mid r_{N-1} = 1 \} \tag{8.16}$$

which for (8.11)-(8.15) is given by

$$\hat V_N(x_N \mid r_{N-1}=1) = p(1,1:x_N)\,[Q(x_N, r_N = 1)] + p(1,2:x_N)\,1000. \tag{8.17}$$


$\hat V_N(x_N \mid r_{N-1}=1)$ is a piecewise-quadratic function of $x_N$
having $\psi_N$ pieces:

$$\hat V_N(x_N \mid r_{N-1}=1) = \hat V_N(t) = x_N^2 \hat K_N(t) + x_N \hat H_N(t) + \hat G_N(t)$$
$$\text{for } x_N \in \Delta_N(t) = (\gamma_N(t-1), \gamma_N(t)), \qquad t = 1, \ldots, \psi_N. \tag{8.18}$$


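Our task is to find the $u_{N-1}$ value, as a function of $x_{N-1}$,
which minimizes

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{u_{N-1}} \{ u_{N-1}^2 + \hat V_N(x_N \mid r_{N-1}=1) \}$$
$$= \min_{t=1,\ldots,\psi_N}\ \min_{u_{N-1} \text{ s.t. } x_N \in \Delta_N(t)} \{ u_{N-1}^2 + x_N^2 \hat K_N(t) + x_N \hat H_N(t) + \hat G_N(t) \}$$
$$= \min_{t=1,\ldots,\psi_N} V_{N-1}(x_{N-1}, r_{N-1}=1 \mid t). \tag{8.19}$$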


In (8.19) we are following the basic idea of Part III: we convert the
control problem (8.11)-(8.15) into the comparison (for each $x_{N-1}$)
of the solutions of a set of constrained-in-$x_N$ JLQ subproblems. Each
subproblem corresponds to driving $x_N$ into one of the domains
$\Delta_N(t)$ of the $\hat V_N(x_N \mid r_{N-1}=1)$ pieces (as in (8.18)).
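The comparison in (8.19) can be sketched numerically. The following is a minimal illustration, not the thesis algorithm. It assumes the scalar dynamics x_N = x_{N-1} + u_{N-1} and a unit control weight (the form suggested by the u_{N-1}^2 term in (8.19)); the tuple layout (lo, hi, K, H, G) and the function names are hypothetical. One-sided limits such as -1^+ are represented by the endpoint itself, so each subproblem returns its infimum. The demonstration pieces assume the x-cost Q(x, r=1) = x^2, which is an assumption made only for this sketch.

```python
import math

def solve_piece(x_prev, piece):
    """Constrained subproblem: minimize u**2 + K*x**2 + H*x + G over lo <= x <= hi,
    where x = x_prev + u (assumed dynamics). Returns (cost, u, x_next)."""
    lo, hi, K, H, G = piece

    def cost(x_next):
        u = x_next - x_prev
        return u * u + K * x_next * x_next + H * x_next + G

    # Finite endpoints are always candidates (the constraint may be active).
    candidates = [c for c in (lo, hi) if math.isfinite(c)]
    # As a function of x_next the cost is (1 + K) x^2 + (H - 2 x_prev) x + const,
    # so an interior stationary point is a minimum only when 1 + K > 0.
    if 1.0 + K > 0.0:
        x_star = (2.0 * x_prev - H) / (2.0 * (1.0 + K))
        if lo < x_star < hi:
            candidates.append(x_star)
    best = min(candidates, key=cost)
    return cost(best), best - x_prev, best

def last_stage(x_prev, pieces):
    """Compare the constrained subproblem solutions, as in (8.19)."""
    return min(solve_piece(x_prev, piece) for piece in pieces)

# Illustrative pieces: the conditional cost obtained from (8.12), (8.14), (8.17)
# when the x-cost is taken to be Q(x, r=1) = x**2 (assumption for this sketch).
INF = math.inf
PIECES = [
    (-INF, -1.0, 0.25, 0.0, 750.0),
    (-1.0, 1.0, 0.75, 0.0, 250.0),
    (1.0, INF, 0.25, 0.0, 750.0),
]

if __name__ == "__main__":
    for x in (-30.0, -10.0, 1.0):
        print(x, last_stage(x, PIECES))
```

Because every subproblem returns a (cost, u, x_next) tuple, the outer `min` performs the comparison of constrained costs directly; a concave-down piece (1 + K <= 0) simply contributes its two boundary candidates.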

We begin with a single one-piece concave upwards quadratic x-cost.
This problem is solvable by the algorithm of Section 7.2. We present
it here for comparison with the other examples of this section.

In Figure 8.1 and all of the graphs of this chapter, the scales are
distorted so that the behavior of the functions at joining points is
highlighted.

Figure 8.1: Q(x,r=1) and k=(N-1) Solution for Example 8.1. Not Drawn
to Scale so as to Emphasize Behavior at Joining Points.


or else the slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ decreases
discontinuously (as at $x_{N-1} = \delta_{N-1}(1) = -26.238$ and
$x_{N-1} = \delta_{N-1}(4) = 26.238$). (Proposition 5.1)

3. The optimal control law $u_{N-1}(x_{N-1}, r_{N-1}=1)$ is a continuous
nonincreasing function of $x_{N-1}$ except at joining points where
$V_{N-1}(x_{N-1}, r_{N-1}=1)$ has a discontinuous slope; at such
points ($x_{N-1} = \pm 26.238$ in Example 8.1), the control law
increases discontinuously. (Proposition 5.3).

4. The optimal controller can hedge-to-a-point only to the low cost
side of a $\hat V_N(x_N \mid r_{N-1}=1)$ discontinuity. These arise
from form transition probability discontinuities. In example 8.1
these are $x_N = -1^+,\ 1^-$. (Proposition 5.2).

5. The mapping

$$x_{N-1} \mapsto x_N(x_{N-1}, r_{N-1}=1)$$

is monotonely nondecreasing. It consists of five line segments:

. a segment with positive slope in each region of $x_{N-1}$ values
where an "unconstrained" cost (driving $x_N$ into the interior of
one of the $\hat V_N(x_N \mid r_{N-1}=1)$ piece domains) is optimal.

. a constant line segment for each $x_{N-1}$ region from which there
is hedging-to-a-point:

$$-26.238 < x_{N-1} < -1.75 \ \ (x_N = -1^+), \qquad 1.75 < x_{N-1} < 26.238 \ \ (x_N = 1^-)$$

in example 8.1. (Proposition 5.3)

6. There are regions of $x_N$ avoidance associated with (and only with)
each joining point where the slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$
decreases discontinuously. These are (-20.9904,-1) and (1,20.9904)
in example 8.1. (Proposition 5.3).

We will now present examples of the problem (8.11)-(8.18) where not
all of the above facts will hold, due to the structure of the x-cost
$Q(x_N, r_N)$.


Example 8.2: (Hedging to discontinuities of the x-cost Q(x,r))
Let the x-cost be piecewise-constant in x:

$$Q(x_{k+1}, r_{k+1}=1) = \begin{cases} 0 & |x_{k+1}| < .5 \\ 100 & |x_{k+1}| > .5 \end{cases} \tag{8.21}$$


This  cost  is  shown  in  Figure  8.2(a).  The  control  problem  (8.11)- 
(8.15),  (8.21)  corresponds  to  a  situation  where  we  want  the  x  process 


to  be  inside  a  certain  interval  (i.e.,  inside  (-.5, .5)),  and  we 
penalize  equally  any  x  value  outside  of  the  desired  interval. 

The  conditional  expected  cost-to-go  in  (8.13)  for  this  problem 
is 


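$$\hat V_N(x_N \mid r_{N-1}=1) = \begin{cases} 775 & \text{if } x_N < -1 \\ 325 & \text{if } -1 < x_N < -.5 \\ 250 & \text{if } -.5 < x_N < .5 \\ 325 & \text{if } .5 < x_N < 1 \\ 775 & \text{if } 1 < x_N \end{cases} \tag{8.22}$$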
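As a quick check of these values, the conditional expected cost-to-go of (8.17) can be evaluated directly from the failure probability (8.12), the terminal penalty (8.14) and the piecewise-constant x-cost of this example. The function names below are illustrative only.

```python
def p_fail(x):
    """Failure probability p(1,2:x) of (8.12)."""
    return 0.25 if abs(x) < 1.0 else 0.75

def q_example_8_2(x):
    """Piecewise-constant x-cost of Example 8.2 in form 1."""
    return 0.0 if abs(x) < 0.5 else 100.0

def v_hat(x, q, terminal_penalty=1000.0):
    """Conditional expected cost-to-go (8.17) from form 1."""
    return (1.0 - p_fail(x)) * q(x) + p_fail(x) * terminal_penalty

# One sample point inside each of the five intervals of the cost-to-go.
values = [v_hat(x, q_example_8_2) for x in (-2.0, -0.75, 0.0, 0.75, 2.0)]
print(values)
```

The five sample points give 775, 325, 250, 325 and 775, one value per interval.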


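Solving the first of the constrained subproblems in (8.19) we obtain
the following: for $x_N \in \Delta_N(1) = (-\infty, -1)$,

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \min_{u_{N-1} \text{ s.t. } x_N < -1} \{ u_{N-1}^2 + 775 \} \tag{8.23}$$

Differentiating $V_{N-1}$ with respect to $u_{N-1}$ and setting to zero,
we find that (8.23) is minimized by $u_{N-1}=0$ with resulting cost
$V_{N-1}=775$ if $x_N = x_{N-1} < -1$. If, however, we have
$x_{N-1} \ge -1$ then the constraint in (8.23) is active. Since

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1)}{(\partial u_{N-1})^2} > 0 \quad \text{for any fixed } x_{N-1},$$

we minimize (8.23) for $x_N < -1$ by driving $x_N$ to $-1^-$, using
$u_{N-1} = -1^- - x_{N-1}$.

For the third subproblem,

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} V^{3,L} = x_{N-1}^2 + x_{N-1} + 250.25 & \text{if } x_{N-1} \le -.5 \\ V^{3,U} = 250 & \text{if } -.5 < x_{N-1} < .5 \\ V^{3,R} = x_{N-1}^2 - x_{N-1} + 250.25 & \text{if } .5 \le x_{N-1} \end{cases}$$

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} u^{3,L} = -x_{N-1} - .5^+ & \text{if } x_{N-1} \le -.5 \\ u^{3,U} = 0 & \text{if } -.5 < x_{N-1} < .5 \\ u^{3,R} = -x_{N-1} + .5^- & \text{if } .5 \le x_{N-1} \end{cases}$$

By symmetry, for each $x_{N-1}$,

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 4) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 2)$$
$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 5) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 1)$$
$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 4) = -u_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 2)$$
$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 5) = -u_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 1)$$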


Performing the comparison in (8.19) at each $x_{N-1}$ we obtain the
solution for the last time stage of this example, as listed in Table 8.2
and shown in Figure 8.2.

x_{N-1} range                V_{N-1}(x_{N-1},r_{N-1}=1)       u_{N-1}(x_{N-1},r_{N-1}=1)   x_N(x_{N-1},r_{N-1}=1)
x_{N-1} < -23.413            775                              0                            x_{N-1}
-23.413 < x_{N-1} < -.5      x_{N-1}^2 + x_{N-1} + 250.25     -x_{N-1} - .5                -.5
-.5 < x_{N-1} < .5           250                              0                            x_{N-1}
.5 < x_{N-1} < 23.413        x_{N-1}^2 - x_{N-1} + 250.25     -x_{N-1} + .5                .5
23.413 < x_{N-1}             775                              0                            x_{N-1}

TABLE 8.2: Optimal controller from (x_{N-1}, r_{N-1}=1) in Example 8.2.


For small $|x_{N-1}|$ ($|x_{N-1}| < .5$) the optimal controller spends no
control to move the x process, since it is already in the domain where
Q(x,r=1)=0. For large $|x_{N-1}|$ ($|x_{N-1}| > 23.413$) the optimal
control is also zero. For $23.413 > |x_{N-1}| > .5$, however, the optimal
strategy is to exert control so as to drive $x_N$ inside the lowest
$\hat V_N(x_N \mid r_{N-1}=1)$ interval in (8.22).

Note that it is never optimal in this example to drive $x_N$ into the
intervals ((-1,-.5) and (.5,1)) where the Q(x,r=1) cost is high but the
failure probability p(1,2:x) is low.
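The breakpoint 23.413 of this example follows from equating the cost of hedging to the edge of the low-cost interval with the cost of applying no control. The sketch below reproduces that calculation and the resulting three-regime control law; it assumes the dynamics x_N = x_{N-1} + u_{N-1} with unit control weight, and the names are illustrative.

```python
import math

V_FAR = 775.0    # conditional cost for |x_N| > 1, from (8.22)
V_LOW = 250.0    # lowest conditional cost, attained for |x_N| < .5
EDGE = 0.5       # boundary of the low-cost interval

# Hedging to the edge costs (|x| - EDGE)**2 + V_LOW; doing nothing costs V_FAR.
BREAKPOINT = EDGE + math.sqrt(V_FAR - V_LOW)

def last_stage_8_2(x_prev):
    """Return (cost, u, x_next) following the structure of Table 8.2."""
    a = abs(x_prev)
    if a < EDGE:
        return V_LOW, 0.0, x_prev            # already in the low-cost interval
    if a < BREAKPOINT:
        target = math.copysign(EDGE, x_prev)  # hedge to the nearer edge
        return (a - EDGE) ** 2 + V_LOW, target - x_prev, target
    return V_FAR, 0.0, x_prev                # too far away: spend no control

print(round(BREAKPOINT, 3), last_stage_8_2(-10.0))
```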

Comparing examples 8.1 and 8.2 we note that:

• The optimal controller hedges to the points $x_N = -.5, .5$ in
example 8.2. These values of $x_N$ are discontinuities of
$\hat V_N(x_N \mid r_{N-1}=1)$ in (8.22). However the optimal hedging is
not to form transition probability discontinuity locations but,
rather, to discontinuities of the x-cost Q(x,r=1).

• The mapping $x_{N-1} \mapsto x_N(x_{N-1}, r_{N-1}=1)$ is once again
monotonely nondecreasing. The regions of $x_N$ avoidance for
example 8.2 are (-23.413,-.5), (.5,23.413).

As in Example 8.1, each region of avoidance is associated with a joining
point of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ where the slope decreases
discontinuously.

In Section 8.6 the optimal controller at time k=(N-2) will be obtained
for this example using the solution algorithm that is developed in
Section 8.5. □

The next example x-cost that we will examine involves quadratic
pieces that are concave-up as well as concave-down. This leads to a
controller that has somewhat different qualitative properties than
those that we have examined previously. In particular, the following
example shows that for the JLPQ controllers of this chapter,

. active hedging-to-a-point can occur to points other than
conditional cost $\hat V_k(x_k \mid r_{k-1}=j)$ discontinuities.


Example 8.3: (x-operating cost having concave-down quadratic pieces)

Let $Q(x_{k+1}, r_{k+1}=1)$ be as follows:

$$Q(x_{k+1}, r_{k+1}=1) = \begin{cases} x_{k+1}^2 & x_{k+1} \le -.5 \\ -x_{k+1}^2 - x_{k+1} & -.5 < x_{k+1} \le 0 \\ -x_{k+1}^2 + x_{k+1} & 0 < x_{k+1} \le .5 \\ x_{k+1}^2 & .5 < x_{k+1} \end{cases} \tag{8.24}$$

This cost is shown in Figure 8.3(a). Note that the inner two pieces
are concave-down.

The conditional expected cost-to-go for the problem (8.11)-(8.15),
(8.24) from form $r_{N-1}=1$ is

$$\hat V_N(x_N \mid r_{N-1}=1) = \begin{cases} (.25)x_N^2 + 750 & x_N < -1 \\ (.75)x_N^2 + 250 & -1 < x_N < -.5 \\ (.75)[-x_N^2 - x_N] + 250 & -.5 < x_N < 0 \\ (.75)[-x_N^2 + x_N] + 250 & 0 < x_N < .5 \\ (.75)x_N^2 + 250 & .5 < x_N < 1 \\ (.25)x_N^2 + 750 & 1 < x_N \end{cases} \tag{8.25}$$


We note in passing that $\hat V_N(x_N \mid r_{N-1}=1)$ is continuous at
$x_N = \pm .5$ and $x_N = 0$. It is discontinuous at $x_N = \pm 1$.

Note that $\hat V_N(x_N \mid r_{N-1}=1)$ is not differentiable at any of
its joining points in (8.25) (i.e., at $\pm 1$, $\pm .5$, 0).

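Solving the constrained subproblems in (8.19) we obtain the following:

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \begin{cases} V^{1,U} = .2x_{N-1}^2 + 750 & x_{N-1} \le -1.25 \\ V^{1,L} = x_{N-1}^2 + 2x_{N-1} + 751.25 & -1.25 < x_{N-1} \end{cases}$$

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \begin{cases} u^{1,U} = -.2x_{N-1} & x_{N-1} \le -1.25 \\ u^{1,L} = -1^- - x_{N-1} & -1.25 < x_{N-1} \end{cases}$$

and

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \begin{cases} V^{2,L} = x_{N-1}^2 + 2x_{N-1} + 251.7286 & x_{N-1} \le -1.75 \\ V^{2,U} = .4285x_{N-1}^2 + 250 & -1.75 < x_{N-1} < -.875 \\ V^{2,R} = x_{N-1}^2 + x_{N-1} + 250.4375 & -.875 \le x_{N-1} \end{cases}$$

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \begin{cases} u^{2,L} = -1^+ - x_{N-1} & x_{N-1} \le -1.75 \\ u^{2,U} = -.4285x_{N-1} & -1.75 < x_{N-1} < -.875 \\ u^{2,R} = -.5 - x_{N-1} & -.875 \le x_{N-1} \end{cases}$$

By the symmetry of the problem, for each $x_{N-1}$,

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 5) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 2)$$
$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 6) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 1)$$
$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 5) = -u_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 2)$$
$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 6) = -u_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 1)$$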


For $x_N \in \Delta_N(3)$ and $x_N \in \Delta_N(4)$, however, the
associated subproblems are different than any we have examined here
previously. Specifically, $\hat V_N(x_N \mid r_{N-1}=1)$ is concave-down
over $\Delta_N(3)$ and $\Delta_N(4)$. Consider the constrained subproblem
$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)$:

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{u_{N-1} \text{ s.t. } -.5 < x_N < 0} \{ u_{N-1}^2 - .75x_N^2 - .75x_N + 250 \} \tag{8.26}$$

Differentiating with respect to $u_{N-1}$ and setting to zero we find
that (8.26) is minimized by

$$u_{N-1} = 3x_{N-1} + 1.50$$

since

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{(\partial u_{N-1})^2} = .5 > 0.$$

The resulting cost is

$$V = -3x_{N-1}^2 - 3x_{N-1} + 249.4375.$$

This solution is only valid, however, if the constraint in (8.26) is
inactive. That is, if the resulting $x_N$ satisfies

$$-.5 < x_N = 4x_{N-1} + 1.5 < 0,$$

which is the case when $-.5 < x_{N-1} < -.375$. Otherwise, we must drive
$x_N$ to the best $x_N$ value in (-.5,0). Note that for each fixed
$x_{N-1}$ value we can write (8.26) as a minimization over $x_N$ values
in (-.5,0):

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{-.5 < x_N < 0} \{ (x_N - x_{N-1})^2 - .75x_N^2 - .75x_N + 250 \}$$
$$= \min_{-.5 < x_N < 0} \{ (.25)x_N^2 + (-2x_{N-1} - .75)x_N + (x_{N-1}^2 + 250) \}$$

Since

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{(\partial x_N)^2} = .5 > 0,$$

the optimal strategy is to

make $x_N = -.5$ if $x_{N-1} \le -.5$

make $x_N = 0$ if $-.375 \le x_{N-1}$.

Consequently we obtain

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} V^{3,L} = x_{N-1}^2 + x_{N-1} + 250.4375 & x_{N-1} \le -.5 \\ V^{3,U} = -3x_{N-1}^2 - 3x_{N-1} + 249.4375 & -.5 < x_{N-1} < -.375 \\ V^{3,R} = x_{N-1}^2 + 250 & -.375 \le x_{N-1} \end{cases}$$

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} u^{3,L} = -.5 - x_{N-1} & x_{N-1} \le -.5 \\ u^{3,U} = 3x_{N-1} + 1.50 & -.5 < x_{N-1} < -.375 \\ u^{3,R} = -x_{N-1} & -.375 \le x_{N-1} \end{cases}$$

and, by the symmetry of the problem,

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 4)$$
$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = -u_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 4)$$
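For reference, the stationarity computation for the third subproblem of this example can be written out step by step. This worked derivation assumes $x_N = x_{N-1} + u_{N-1}$ with unit control weight, which is the relation implied by the unconstrained control $3x_{N-1} + 1.5$ and the resulting state $4x_{N-1} + 1.5$; the symbol $J$ is introduced here only as shorthand for the bracketed cost.

```latex
\begin{align*}
J(u_{N-1}) &= u_{N-1}^2 - .75\,(x_{N-1}+u_{N-1})^2 - .75\,(x_{N-1}+u_{N-1}) + 250 \\
\frac{\partial J}{\partial u_{N-1}} &= 2u_{N-1} - 1.5\,(x_{N-1}+u_{N-1}) - .75 = 0
  \;\Longrightarrow\; u_{N-1} = 3x_{N-1} + 1.5 \\
\frac{\partial^2 J}{(\partial u_{N-1})^2} &= 2 - 1.5 = .5 > 0 \\
x_N &= x_{N-1} + u_{N-1} = 4x_{N-1} + 1.5 \\
J &= (3x_{N-1}+1.5)^2 - .75\,(4x_{N-1}+1.5)^2 - .75\,(4x_{N-1}+1.5) + 250 \\
  &= -3x_{N-1}^2 - 3x_{N-1} + 249.4375
\end{align*}
```

The constraint $-.5 < 4x_{N-1} + 1.5 < 0$ then gives the validity range $-.5 < x_{N-1} < -.375$.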

Performing the comparison in (8.19) for the six constrained subproblem
solutions (at each $x_{N-1}$) we obtain the last time-stage solution of
this example, as listed in Table 8.3 and shown in Figure 8.3.

In Figure 8.3(b) we see that, as in earlier examples, the optimal cost
$V_{N-1}(x_{N-1}, r_{N-1}=1)$ is piecewise-quadratic in $x_{N-1}$. This
example differs from those considered earlier in that some of these
pieces have $\partial^2 V_{N-1} / (\partial x_{N-1})^2 < 0$.

The only nondifferentiable points of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ are at
$x_{N-1} = \pm 26.238$. At the other joining points of
$V_{N-1}(x_{N-1}, r_{N-1}=1)$,

$$\frac{\partial V_{N-1}(x_{N-1}, r_{N-1}=1)}{\partial x_{N-1}} = \begin{cases} \pm 1.5 & \text{at } x_{N-1} = \pm 1.75 \\ \pm .75 & \text{at } x_{N-1} = \pm .875 \\ 0 & \text{at } x_{N-1} = \pm .5 \\ \pm .75 & \text{at } x_{N-1} = \pm .375 \end{cases}$$

At $x_{N-1} = \pm .5$, $V_{N-1}(x_{N-1}, r_{N-1}=1)$ has inflection points.

x_{N-1} range                  V_{N-1}(x_{N-1},r_{N-1}=1)         u_{N-1}(x_{N-1},r_{N-1}=1)   x_N(x_{N-1},r_{N-1}=1)
x_{N-1} < -26.238              .2x_{N-1}^2 + 750                  -.2x_{N-1}                   .8x_{N-1}
-26.238 < x_{N-1} < -1.75      x_{N-1}^2 + 2x_{N-1} + 251.73      -x_{N-1} - 1^+               -1^+
-1.75 < x_{N-1} < -.875        .4285x_{N-1}^2 + 250               -.4285x_{N-1}                .5715x_{N-1}
-.875 < x_{N-1} < -.5          x_{N-1}^2 + x_{N-1} + 250.44       -x_{N-1} - .5                -.5
-.5 < x_{N-1} < -.375          -3x_{N-1}^2 - 3x_{N-1} + 249.44    3x_{N-1} + 1.5               4x_{N-1} + 1.5
-.375 < x_{N-1} < .375         x_{N-1}^2 + 250                    -x_{N-1}                     0
.375 < x_{N-1} < .5            -3x_{N-1}^2 + 3x_{N-1} + 249.44    3x_{N-1} - 1.5               4x_{N-1} - 1.5
.5 < x_{N-1} < .875            x_{N-1}^2 - x_{N-1} + 250.44       -x_{N-1} + .5                .5
.875 < x_{N-1} < 1.75          .4285x_{N-1}^2 + 250               -.4285x_{N-1}                .5715x_{N-1}
1.75 < x_{N-1} < 26.238        x_{N-1}^2 - 2x_{N-1} + 251.73      -x_{N-1} + 1^-               1^-
26.238 < x_{N-1}               .2x_{N-1}^2 + 750                  -.2x_{N-1}                   .8x_{N-1}

TABLE 8.3: Optimal controller from (x_{N-1}, r_{N-1}=1) in Example 8.3.


As in earlier examples, the optimal control law is a piecewise-linear
function of $x_{N-1}$. The control law in example 8.3 does not decrease
between all joining points (unlike examples 8.1, 8.2 and any problem in
Chapters 5-7 with b(j), a(j) > 0). The slope of
$u_{N-1}(x_{N-1}, r_{N-1}=1)$ is positive in the intervals (-.5,-.375)
and (.375,.5) but negative everywhere else it exists (see Figure 8.3(c)).

The optimal controller in example 8.3 hedges to the points
$x_N = -1^+$, $x_N = -.5$, $x_N = .5$, $x_N = 1^-$ and $x_N = 0$. Two of
these values ($x_N = -1^+, 1^-$) are hedging to the low cost side of a
$\hat V_N(x_N \mid r_{N-1}=1)$ discontinuity, as in previously studied
examples.

However, $\hat V_N(x_N \mid r_{N-1}=1)$ is not discontinuous at
$x_N = \pm .5$ and $x_N = 0$, yet we hedge to these values. From (8.25)
note that $x_N = \pm .5, 0$ are the boundaries of
$\hat V_N(x_N \mid r_{N-1}=1)$ pieces that are concave down. Necessary
conditions for hedging-to-a-point will be stated in Section 8.5
(corollary 8.3).

As in example 8.1 and 8.2, we see from Figure 8.3(d) that the mapping

$$x_{N-1} \mapsto x_N(x_{N-1}, r_{N-1}=1)$$

is monotonely nondecreasing. Note that the regions of avoidance

(-20.99,-1) and (1,20.99)

are associated once again with $x_{N-1}$ values where the slope of
$V_{N-1}(x_{N-1}, r_{N-1}=1)$ decreases discontinuously (i.e., at
$x_{N-1} = \pm 26.238$).

There are no regions of $x_N$ avoidance associated with hedging to
$x_N = \pm .5$ or $x_N = 0$. □
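The monotonicity and avoidance claims for Example 8.3 can be checked numerically from the closed-loop map of Table 8.3. The sketch below copies the breakpoints from the table and assumes its odd symmetry; the function name is illustrative, and 4/7 is the unrounded value of the slope listed as .5715.

```python
def x_next_8_3(x_prev):
    """Closed-loop map x_{N-1} -> x_N with breakpoints copied from Table 8.3."""
    a = abs(x_prev)
    if a <= 0.375:
        nxt = 0.0                 # hedge to x_N = 0
    elif a < 0.5:
        nxt = 4.0 * a - 1.5       # interior of the concave-down piece
    elif a <= 0.875:
        nxt = 0.5                 # hedge to |x_N| = .5
    elif a < 1.75:
        nxt = 4.0 * a / 7.0       # unconstrained, inner convex piece
    elif a <= 26.238:
        nxt = 1.0                 # hedge to the low-cost side of |x_N| = 1
    else:
        nxt = 0.8 * a             # unconstrained, outer piece
    return nxt if x_prev >= 0 else -nxt

# Sample the map on a fine grid and test the two qualitative claims.
grid = [i / 1000.0 for i in range(-40000, 40001)]
images = [x_next_8_3(x) for x in grid]
is_monotone = all(b >= a for a, b in zip(images, images[1:]))
in_avoided = [y for y in images if 1.0 < abs(y) < 20.99]
print(is_monotone, len(in_avoided))
```

On this grid the map is nondecreasing and no image falls inside (1, 20.99) or (-20.99, -1), while values near 0 and ±.5 are reached from both sides, consistent with the absence of avoidance regions there.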

The  next  example  illustrates  additional  issues  that  the  modified 
solution  algorithm  must  cope  with  in  solving  the  JLPQ  problems  of 
this  chapter. 


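Example 8.4: (Subproblem with no unconstrained minimum)

Let the x-cost be piecewise-quadratic in x with concave-up endpieces:

$$Q(x_{k+1}, r_{k+1}=1) = \begin{cases} x_{k+1}^2 & x_{k+1} \le -.5 \\ -2x_{k+1}^2 - 1.5x_{k+1} & -.5 < x_{k+1} \le 0 \\ -2x_{k+1}^2 + 1.5x_{k+1} & 0 < x_{k+1} \le .5 \\ x_{k+1}^2 & .5 < x_{k+1} \end{cases} \tag{8.27}$$

This cost is shown in Figure 8.4(a). The conditional expected
cost-to-go for the problem (8.11)-(8.15), (8.27) from form $r_{N-1}=1$ is

$$\hat V_N(x_N \mid r_{N-1}=1) = \begin{cases} (.25)x_N^2 + 750 & x_N < -1 \\ (.75)x_N^2 + 250 & -1 < x_N < -.5 \\ (.75)[-2x_N^2 - 1.5x_N] + 250 & -.5 < x_N < 0 \\ (.75)[-2x_N^2 + 1.5x_N] + 250 & 0 < x_N < .5 \\ (.75)x_N^2 + 250 & .5 < x_N < 1 \\ (.25)x_N^2 + 750 & 1 < x_N \end{cases} \tag{8.28}$$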


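For $|x_N| > .5$, $\hat V_N(x_N \mid r_{N-1}=1)$ in (8.28) is the same as
that of example 8.3 in (8.25). As in the last example,
$\hat V_N(x_N \mid r_{N-1}=1)$ is continuous at $x_N = \pm .5$ and at
$x_N = 0$. It is discontinuous at $x_N = \pm 1$. As in Example 8.3,
$\hat V_N(x_N \mid r_{N-1}=1)$ is not differentiable at
$x_N = \pm 1, \pm .5, 0$.

Solving the constrained subproblems in (8.19) we obtain the same
$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid t)$ for t=1,2,5,6 as in Example 8.3:

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \begin{cases} V^{1,U} = .2x_{N-1}^2 + 750 & x_{N-1} \le -1.25 \\ V^{1,L} = x_{N-1}^2 + 2x_{N-1} + 751.25 & x_{N-1} > -1.25 \end{cases}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \begin{cases} V^{2,L} = x_{N-1}^2 + 2x_{N-1} + 251.7286 & x_{N-1} \le -1.75 \\ V^{2,U} = .4285x_{N-1}^2 + 250 & -1.75 < x_{N-1} < -.875 \\ V^{2,R} = x_{N-1}^2 + x_{N-1} + 250.4375 & -.875 \le x_{N-1} \end{cases}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 6)$$
$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = V_{N-1}(-x_{N-1}, r_{N-1}=1 \mid 5)$$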


Consider now the constrained subproblem
$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)$:

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{u_{N-1} \text{ s.t. } -.5 < x_N < 0} \{ u_{N-1}^2 + .75(-2x_N^2 - 1.5x_N) + 250 \} \tag{8.29}$$

Differentiating twice with respect to $u_{N-1}$ we obtain

$$\frac{\partial V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{\partial u_{N-1}} = 2u_{N-1} + (.75)(-2)(2)(x_{N-1} + u_{N-1}) + (.75)(-1.5)$$

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{(\partial u_{N-1})^2} = 2 - 4(.75) = -1 < 0.$$

Thus for $u_{N-1}$ such that
$\partial V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) / \partial u_{N-1} = 0$ we
obtain a maximum instead of a minimum. For a fixed $x_{N-1}$ value, we
can rewrite (8.29) as

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{-.5 < x_N < 0} \{ (x_N - x_{N-1})^2 + 250 + (.75)(-2x_N^2 - 1.5x_N) \}$$
$$= \min_{-.5 < x_N < 0} \{ -.5x_N^2 - (2x_{N-1} + 1.125)x_N + [x_{N-1}^2 + 250] \}$$
(8.

Since

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{(\partial x_N)^2} = -1 < 0$$

the optimal choice of $x_N$ inside $\Delta_N(3) = (-.5, 0)$ is on a
boundary (either $x_N = -.5^+$ or $x_N = 0^-$) for every $x_{N-1}$.


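If we make $x_N = -.5$ we will use control

$$u_{N-1}^{3,L} = -x_{N-1} - .5^+$$

and incur cost

$$V^{3,L} = x_{N-1}^2 + x_{N-1} + 250.4375.$$

If we make $x_N = 0$ we will use control

$$u_{N-1}^{3,R} = -x_{N-1}$$

and incur cost

$$V^{3,R} = x_{N-1}^2 + 250.$$

From (8.34)-(8.35) we see that

$$V^{3,L} < V^{3,R} \quad \text{if } x_{N-1} < -.4375.$$

Consequently we obtain

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} V^{3,L} = x_{N-1}^2 + x_{N-1} + 250.4375 & \text{if } x_{N-1} < -.4375 \\ V^{3,R} = x_{N-1}^2 + 250 & \text{if } x_{N-1} \ge -.4375 \end{cases}$$
(8.3

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \begin{cases} -x_{N-1} - .5^+ & \text{if } x_{N-1} < -.4375 \\ -x_{N-1} & \text{if } x_{N-1} \ge -.4375 \end{cases}$$
(8.3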
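Because the third piece of this example is concave down in both the control and the next state, only the two interval boundaries need to be compared. A minimal sketch of that comparison follows; it assumes x_N = x_{N-1} + u_{N-1} with unit control weight, and the function names are illustrative.

```python
def v_hat_piece3(x_next):
    """Concave-down piece of the conditional cost on -.5 < x_N < 0."""
    return 0.75 * (-2.0 * x_next ** 2 - 1.5 * x_next) + 250.0

def boundary_costs(x_prev):
    """Costs of driving x_N to each end of (-.5, 0), assuming x_N = x_prev + u."""
    left = (-0.5 - x_prev) ** 2 + v_hat_piece3(-0.5)
    right = (0.0 - x_prev) ** 2 + v_hat_piece3(0.0)
    return left, right

def best_boundary(x_prev):
    """Boundary of (-.5, 0) that the subproblem controller selects."""
    left, right = boundary_costs(x_prev)
    return -0.5 if left < right else 0.0

# Algebraically, left - right = x_prev + 0.4375, so the choice switches
# at x_prev = -0.4375.
print(best_boundary(-0.45), best_boundary(-0.40), boundary_costs(-1.0))
```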


This is because (as we noted before) $\hat V_N(x_N \mid r_{N-1}=1)$ is
continuous at $x_N = \pm .5, 0$. Performing the comparison of subproblem
solutions in (8.19) for the six problems here, we obtain the last
time-stage solution for this example. This solution is listed in
Table 8.4 and shown in Figure 8.4.

x_{N-1} range                  V_{N-1}(x_{N-1},r_{N-1}=1)         u_{N-1}(x_{N-1},r_{N-1}=1)   x_N(x_{N-1},r_{N-1}=1)
x_{N-1} < -26.238              .2x_{N-1}^2 + 750                  -.2x_{N-1}                   .8x_{N-1}
-26.238 < x_{N-1} < -1.75      x_{N-1}^2 + 2x_{N-1} + 251.73      -x_{N-1} - 1^+               -1^+
-1.75 < x_{N-1} < -.875        .4285x_{N-1}^2 + 250               -.4285x_{N-1}                .5715x_{N-1}
-.875 < x_{N-1} < -.4375       x_{N-1}^2 + x_{N-1} + 250.44       -x_{N-1} - .5                -.5
-.4375 < x_{N-1} < .4375       x_{N-1}^2 + 250                    -x_{N-1}                     0
.4375 < x_{N-1} < .875         x_{N-1}^2 - x_{N-1} + 250.44       -x_{N-1} + .5                .5
.875 < x_{N-1} < 1.75          .4285x_{N-1}^2 + 250               -.4285x_{N-1}                .5715x_{N-1}
1.75 < x_{N-1} < 26.238        x_{N-1}^2 - 2x_{N-1} + 251.73      -x_{N-1} + 1^-               1^-
26.238 < x_{N-1}               .2x_{N-1}^2 + 750                  -.2x_{N-1}                   .8x_{N-1}

TABLE 8.4: Optimal control from (x_{N-1}, r_{N-1}=1) in Example 8.4.

Comparing the results of Examples 8.3 and 8.4 we see that

1. Examples 8.3 and 8.4 have the same solution at k=(N-1) except for
$.375 < |x_{N-1}| < .5$.

2. In Example 8.3 the cost $V_{N-1}(x_{N-1}, r_{N-1}=1) = V^{3,U}$ in
$.375 < |x_{N-1}| < .5$ is concave down (Figure 8.3(b)). In Example 8.4
this cost piece of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ is missing. The
adjacent cost pieces $V^{2,R}$ and $V^{3,L}$ are optimal over
(-.5,-.375). Cost $V^{3,U}$ is missing in Figure 8.4(b) because it is
never valid. Here

$$\frac{\partial^2 V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3)}{(\partial u_{N-1})^2} < 0,$$

hence the constraint in (8.29) is always active.

3. In Example 8.4 we have two additional regions of $x_N$ avoidance,
(-.5,0) and (0,.5). Although it is optimal to drive $x_N$ to exactly
zero for $|x_{N-1}| < .4375$, it is never optimal to place $x_N$ near
zero. □

In the above examples of JLPQ problems, the last-stage controllers
exhibit certain qualitative behaviors that are not manifested by the
JLQ problems of Part III. These aspects of the JLPQ controller must be
accounted for in the development of a solution algorithm. We list them
here for convenience:

1. Hedging need not only be to form transition probability
discontinuities in JLPQ problems. It can be to conditional cost
$\hat V_k(x_k \mid r_{k-1})$ discontinuities that arise from x-cost
discontinuities in $Q(x_k, r_k)$ or $Q_T(x_N, r_N)$. (Example 8.2).

2. Hedging-to-a-point can occur to points that are not discontinuities
of the conditional cost $\hat V_k(x_k \mid r_{k-1})$. However, these
points are the boundaries of $\hat V_k(x_k \mid r_{k-1})$ pieces that
are concave down. (Examples 8.3, 8.4).

3. The optimal control law $u_k(x_k, r_k=j)$ may be discontinuous at
$x_k$ values where the optimal cost $V_k(x_k, r_k=j)$ is
differentiable. (Examples 8.3, 8.4).

4. Hedging-to-a-point ($x_k = \bar{x}$) can occur without an
accompanying region of $x_k$ avoidance (example 8.3).

5. Regions of $x_k$ avoidance are associated with (and only with)
$x_{k-1}$ values where $u_{k-1}$ is discontinuous (i.e., where
$V_{k-1}(x_{k-1}, r_{k-1}=1)$ is not differentiable). Thus there is no
region of avoidance associated with hedging to a continuous point of
$\hat V_k(x_k \mid r_{k-1}=1)$. (Example 8.3, 8.4).

6. In some instances, the constrained subproblem
$V_k(x_k, r_k=j \mid t)$, corresponding to driving $x_{k+1}$ into
$\Delta_{k+1}(t)$ where $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ is concave
down in $\Delta_{k+1}(t)$, may result in a subproblem controller that
never drives $x_{k+1}$ into the interior of $\Delta_{k+1}(t)$, for any
$x_k$ value. (Example 8.4, but not example 8.3).


In  the  next  section  we  will  solve  the  general  JLPQ  control  problem 
(of  Section  8.2)  for  one  time-stage.  This  result  will  then  be  used 
in  Section  8.5  to  construct  a  solution  algorithm  for  these  problems. 

The  examples  of  this  chapter  will  provide  insight  regarding  how  the 
solution  algorithm  of  Chapter  7  must  be  altered  for  JLPQ  problems. 


In this section we use intuition gained from the example problems of
the last section to solve the optimal control problem of Section 8.2
for one time stage. As we indicated earlier, the notation and
"bookkeeping" becomes quite complex, but the basic idea is the same as
illustrated in the previous section. Inductive application of the
one-stage solution (backwards in time from finite terminal time N) then
establishes that the solution of problem (8.1)-(8.10) yields optimal
expected costs-to-go that are piecewise-quadratic in x and optimal
control laws that are piecewise-linear, for all forms $j \in M$:

$$V_k(x_k, r_k=j) = x_k^2 K_k(t:j) + x_k H_k(t:j) + G_k(t:j) \tag{8.36}$$

$$u_k(x_k, r_k=j) = -L_k(t:j)\,x_k + F_k(t:j) \tag{8.37}$$

when

$$\delta_k^j(t-1) < x_k < \delta_k^j(t),$$

where

$$\delta_k^j(1) < \delta_k^j(2) < \cdots < \delta_k^j(m_k(j)-1)$$

are the points where the pieces of $V_k(x_k, j)$ are joined together
(the boundaries of the $x_k$ intervals) and
$\delta_k^j(0) = -\infty$, $\delta_k^j(m_k(j)) = \infty$.

The proof of the one-stage optimal controller result is constructive.
It suggests an algorithm for the recursive determination of the optimal
expected costs-to-go and control laws for this problem. An efficient
algorithm for this determination of the optimal controller
(8.36)-(8.37) is presented in flowchart form in Section 8.5.
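One way to hold a piecewise-quadratic cost-to-go and the matching piecewise-linear control law for a single form and time stage is sketched below. The class, its field names and the convention at joining points are hypothetical choices for this illustration, not notation taken from the thesis; the sample data are the Example 8.2 last-stage values of Table 8.2.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PiecewiseSolution:
    """Hypothetical container for one form j at one time k.

    joins  : the m-1 joining points, in increasing order
    cost   : m tuples (K, H, G) with V = K*x**2 + H*x + G on each piece
    control: m tuples (L, F) with u = -L*x + F on each piece
    """
    joins: List[float]
    cost: List[Tuple[float, float, float]]
    control: List[Tuple[float, float]]

    def piece(self, x: float) -> int:
        # Zero-based index of the piece containing x; a joining point is
        # assigned to the piece on its right (a convention of this sketch).
        return bisect_right(self.joins, x)

    def value(self, x: float) -> float:
        K, H, G = self.cost[self.piece(x)]
        return K * x * x + H * x + G

    def u(self, x: float) -> float:
        L, F = self.control[self.piece(x)]
        return -L * x + F

# Last-stage solution of Example 8.2, transcribed from Table 8.2.
table_8_2 = PiecewiseSolution(
    joins=[-23.413, -0.5, 0.5, 23.413],
    cost=[(0, 0, 775), (1, 1, 250.25), (0, 0, 250), (1, -1, 250.25), (0, 0, 775)],
    control=[(0, 0), (1, -0.5), (0, 0), (1, 0.5), (0, 0)],
)
print(table_8_2.value(-10.0), table_8_2.u(-10.0))
```

Storing the joining points as a sorted list makes the lookup a binary search, which keeps evaluation cheap as the number of pieces grows during the backward recursion.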

The one-stage solution result is as follows:

(i) $V_k(x_k, r_k=j)$ is piecewise-quadratic and $u_k(x_k, r_k=j)$ is
piecewise-linear (as in (8.36)-(8.37)), each having $m_k(j)$ pieces
joined continuously at

$$\{\delta_k^j(1) < \delta_k^j(2) < \cdots < \delta_k^j(m_k(j)-1)\}$$

(ii)

$$\begin{bmatrix} K_k(t:j) & H_k(t:j)/2 \\ H_k(t:j)/2 & G_k(t:j) \end{bmatrix} \ge 0 \quad \text{at } t=1 \text{ and } t=m_k(j)$$

(iii) $\partial V_k(x_k, r_k=j) / \partial x_k$ is either continuous or
decreases discontinuously at the joining points
$\{\delta_k^j(1), \ldots, \delta_k^j(m_k(j)-1)\}$. □

At time k=N, conditions (i)-(ii) are clearly satisfied. If we consider
the sum of the x-terminal cost $Q_T(x_N, r_N)$ and x-operating cost
$Q(x_N, r_N)$ to be the last-stage x-operating cost (that is, we think
of $V_N(x_N, r_N) = 0$) then (iii) is also satisfied at time k=N. Thus
this proposition can be applied inductively, backwards in time from
k=N. Equations for the iterative computation of the quantities
$m_k(j)$, $K_k(t:j)$, $H_k(t:j)$, $G_k(t:j)$ and
$\{\delta_k^j(t) : t = 1, \ldots, m_k(j)-1\}$ for each $j \in M$ are
listed in Appendix D.1. These equations are developed in the proof of
Proposition 8.1, which constitutes the remainder of this section (with
some details in Appendix D.2).

Proof of Proposition 8.1:

For each form $r_k = j \in M$, the minimization in (8.10) subject to
(8.1)-(8.9) is converted into the comparison of a finite set of
constrained-in-$x_{k+1}$ JLQ problems, each with x-independent forms.


This is done conceptually via the following steps:

Step 1: Obtain a composite partition of $x_{k+1}$ values from the
partitions associated with the x-costs $Q(x_{k+1}, r_{k+1}=i)$, with the
form transition probabilities p(j,i:x) and the expected costs-to-go
$V_{k+1}(x_{k+1}, r_{k+1}=i)$ for each $i \in C_j$. The composite
partition grid points are the boundaries of the pieces of the
piecewise-quadratic function $\hat V_{k+1}(x_{k+1} \mid r_k=j)$:

$$\hat V_{k+1}(x_{k+1} \mid r_k=j) = E\{ V_{k+1}(x_{k+1}, r_{k+1}) + Q(x_{k+1}, r_{k+1}) \mid x_{k+1}, r_k=j \}$$
(8.

This step is similar to step 1 of Section 5.4, except that here we must
include $Q(x_{k+1}, r_{k+1})$ discontinuities.

Step 2: Formulating a set of constrained (in $x_{k+1}$) JLQ problems
having x-independent form transition probabilities and one-piece
quadratic costs; one problem for each region of $x_{k+1}$ values in the
partition of Step 1. This step is like step 2 in Section 5.4.

Step 3: Solving the constrained subproblems that are formulated in
Step 2. These problem solutions represent the optimal expected
costs-to-go from $(x_k, r_k=j)$ if $x_{k+1}$ is constrained to be in one
of the specific regions of $x_{k+1}$ values defined in Step 1.

Step 4: Comparing the constrained costs. The optimal expected
cost-to-go $V_k(x_k, r_k=j)$ from any $x_k$ value is the minimum of the
constrained expected costs-to-go that are obtained in Step 3. This
minimization involves the comparison of piecewise-quadratic functions
in $x_k$. The implementation of this step in the algorithm developed in
Section 8.5 is more complicated than for the JLQ problems of Part III.

We will describe each of these conceptual steps in sequence so as to
demonstrate the validity of Proposition 8.1. The actual solution
algorithm mixes these steps and uses other facts (that will be
developed) to solve the control problem efficiently (i.e., with fewer
calculations).
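Step 1 amounts to merging several sorted grids of joining points into one. The sketch below shows one simple way to do that and to list the resulting regions; the function names are hypothetical, and the sample grids are those of Example 8.3 (transition probability jumps at ±1, x-cost joining points at -.5, 0, .5).

```python
import math

def composite_partition(*grids):
    """Superimpose several grids of joining points into one sorted grid (Step 1)."""
    return sorted({point for grid in grids for point in grid})

def intervals(points):
    """Open intervals of the real line delimited by the composite grid."""
    edges = [-math.inf] + list(points) + [math.inf]
    return list(zip(edges[:-1], edges[1:]))

# Grids of Example 8.3: transition probability jumps and x-cost joining points.
P_GRID = [-1.0, 1.0]
Q_GRID = [-0.5, 0.0, 0.5]
GRID = composite_partition(P_GRID, Q_GRID)
REGIONS = intervals(GRID)
print(GRID, len(REGIONS))
```

The merged grid has five points and therefore six regions, one for each of the six constrained subproblems compared in Example 8.3. Using a set removes grid points shared by more than one source grid, so no empty interval is generated.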


Proof Step 1: For each form $j \in M$ we construct a composite partition
of the real line (of $x_{k+1}$ values) by superimposing the grids associated
with $p(j,i:x)$, $Q(x,i)$ and $V_{k+1}(x_{k+1}, r_{k+1}=i)$, for all $i \in C_j$.

The  general  procedure  for  obtaining  the  composite  partitions  is 


as  follows : 


For each $r_k = j \in M$ the real line can be divided into a finite
number of intervals of $x_{k+1}$ values by superimposing the grids


{U1 (t) :  t=l, . . . ,j?l} 


{v. . (t) :  t=l, . . . ,V, .-1} 

J1  Jl 


for  each  i  S  C .  , 
j 


obtaining  the  composite  partition 


-  -  YiU(0,<  YiU(1)<  Yk+i<*i*i‘1,<  Yk+X+i>  =  “ 


of  uni.^  .a  grid  points. we  define 


=  the  (finite)  number  of  such  nonempty  xk+1 
intervals 


where  the  t  such  interval  is 


Ak.i(t)  ‘  {lW  Yw(t-U<  V  Yk+i(t)) 


t=1 . \+l‘l 
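The superposition of grids described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `composite_partition`, the tolerance `tol`, and the representation of intervals as `(lo, hi)` tuples are assumptions made here, not part of the original algorithm.

```python
import math

def composite_partition(grids, tol=1e-9):
    """Merge several lists of finite grid points into one composite
    partition of the real line.

    Returns a list of intervals (lo, hi), with -inf and +inf as the
    outermost endpoints. Coincident grid points (within tol) are merged,
    so no empty interval is produced.
    """
    points = sorted(p for grid in grids for p in grid)
    merged = []
    for p in points:
        # Keep a point only if it is distinct from the previous one.
        if not merged or p - merged[-1] > tol:
            merged.append(p)
    edges = [-math.inf] + merged + [math.inf]
    return list(zip(edges[:-1], edges[1:]))

# Hypothetical grids: one from the transition probabilities, one from the
# x-cost, and an empty one from a single-piece next-stage cost.
intervals = composite_partition([[-1.0, 2.0], [0.0, 2.0], []])
```

The length of the returned list plays the role of the number of nonempty $x_{k+1}$ intervals defined in the text.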

These  intervals  of  x,  ,  values  are  the  domains  of  the  individual 

k+1 

quadratic-in-x^+^  pieces  of  the  function 

Vx'vJv’1  -*&>  -  Vii"1  +  *  sL(t' 


for  x.  .  €  A.  .  (t) 
region of $x_{k+1}$ values. To see this, note that over each such
region $\Delta^j_{k+1}(t)$, $Q(x_{k+1}, r_{k+1}=i)$ and $V_{k+1}(x_{k+1}, r_{k+1}=i)$ are quadratic
and $p(j,i;x_{k+1})$ is constant in $x_{k+1}$, for all $i \in C_j$. These constrained
problems are:


VWj|W  AJ+i(tl1  -  Wj|tl  = 


!\R(j)  +  xj  Q:(  t) 

+  Sj(t)Xk+i  +P  j(t) 

xtx  KTi  +  Vk+1 ^Xk+l'rk+l^ 


(8.42) 


\  R(j) 


^+lQl(t)+Xk+!Sl(t) 


W  ^+l(t) 


p(jfi»xk+l)  +  P  (t) 


+  Vi'Vi'11 


min  / 

^s.t.  K«0)  *  vMlVllVj) 

W  AL(l)  1 


(8.43) 


subject  to  (8.1)- (8.3)  for  each  t=l,  2, . . . 


Proof Step 3: Solving the constrained subproblems


The third step in this constructive proof of Proposition 8.1 is to
solve the constrained JLQ problems of (8.42). For each $r_k = j \in M$
the solutions of these constrained optimization problems
involve optimal expected costs-to-go that are piecewise-quadratic
in $x_k$, with two or three parts:


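$$V_k(x_k, r_k=j \mid t) = \begin{cases} V_k^{t,L}(x_k,j) & \text{if } a(j)x_k \le \theta^j_k(t) \\ V_k^{t,U}(x_k,j) & \text{if } \theta^j_k(t) < a(j)x_k < \Theta^j_k(t) \\ V_k^{t,R}(x_k,j) & \text{if } \Theta^j_k(t) \le a(j)x_k \end{cases} \tag{8.44}$$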


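with corresponding optimal control laws

$$u_k[x_k, r_k=j \mid x_{k+1} \in \Delta^j_{k+1}(t)] = \begin{cases} u_k^{t,L}(x_k,j) & \text{if } a(j)x_k \le \theta^j_k(t) \\ u_k^{t,U}(x_k,j) & \text{if } \theta^j_k(t) < a(j)x_k < \Theta^j_k(t) \\ u_k^{t,R}(x_k,j) & \text{if } \Theta^j_k(t) \le a(j)x_k \end{cases} \tag{8.45}$$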


As in Chapter 5, the superscripts L, R and U correspond,
respectively, to driving $x_{k+1}$ to the left endpoint, the right
endpoint, or the interior (where constraint (8.42) is inactive)
of the region

$$\Delta^j_{k+1}(t) = (\gamma^j_{k+1}(t-1),\ \gamma^j_{k+1}(t))$$

in (8.44)-(8.45).


For $t=1$ there are no left parts $V_k^{1,L}$ and $u_k^{1,L}$.
For $t = \psi^j_{k+1}$ there are no right parts $V_k^{t,R}$ and $u_k^{t,R}$. If

$$\frac{\partial^2 V_k^{t,U}}{(\partial x_{k+1})^2} = 2\left[\frac{R(j)}{b^2(j)} + \hat K^j_{k+1}(t)\right] < 0$$

for some $t$, then the inactive-constraint control $u_k^{t,U}$ is never
valid; that is, for this $t$

$$\theta^j_k(t) = \Theta^j_k(t)$$

in (8.44)-(8.45), and therefore $V_k(x_k, r_k=j \mid t)$ has only two pieces.

This kind of subproblem solution in (8.42) is the result of a sufficiently
concave-down piece of the x-cost $Q(x, r=j)$. We saw an
example of this in Example 8.4. The derivation of expressions
for these control law and expected cost pieces involves
straightforward (but tedious) algebraic manipulations that are described
in Appendix D.2. Formulae for the quantities in (8.44)-(8.45)
are listed for reference in Appendix D.1.
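The three-part structure of a constrained subproblem solution can be illustrated numerically. The sketch below assumes the scalar noiseless dynamics $x_{k+1} = a x_k + b u_k$ and a one-piece conditional cost $K x_{k+1}^2 + S x_{k+1} + P$; the function name and argument names are hypothetical, and the closed-form expressions of Appendix D.1 are not reproduced here.

```python
import math

def constrained_stage(x, a, b, R, K, S, P, lo, hi):
    """Minimize R*u**2 + K*x_next**2 + S*x_next + P over x_next in [lo, hi],
    where u = (x_next - a*x) / b.

    Returns (cost, x_next, piece) with piece in {'L', 'U', 'R'}.
    Assumes at least one of lo, hi is finite, or that the cost is strictly
    convex in x_next; otherwise the problem has no finite minimizer.
    """
    def cost(x_next):
        u = (x_next - a * x) / b
        return R * u * u + K * x_next * x_next + S * x_next + P

    candidates = []
    curvature = R + b * b * K  # same sign as the second derivative in x_next
    if curvature > 0:
        # Stationary point of the unconstrained problem.
        x_star = (a * R * x - 0.5 * b * b * S) / curvature
        if lo < x_star < hi:
            candidates.append((cost(x_star), x_star, "U"))
    # Endpoint candidates: driving x_next to the left or right boundary.
    if math.isfinite(lo):
        candidates.append((cost(lo), lo, "L"))
    if math.isfinite(hi):
        candidates.append((cost(hi), hi, "R"))
    return min(candidates)
```

When the curvature is not positive the interior candidate never appears, so only the 'L' and 'R' parts remain, which matches the two-piece case discussed above.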

The $V_k^{t,L}(x_k,j)$, $V_k^{t,U}(x_k,j)$ and $V_k^{t,R}(x_k,j)$ in (8.44)-(8.45)
are similar to those of Chapter 5. In particular, when
$\partial^2 V_k^{t,U}/(\partial x_{k+1})^2 > 0$ we have the following:

• at $x_k = \theta^j_k(t)/a(j)$ the values and slopes of $V_k^{t,L}(x_k,j)$ and
$V_k^{t,U}(x_k,j)$ are the same. At $x_k = \Theta^j_k(t)/a(j)$, the values and slopes of
$V_k^{t,U}(x_k,j)$ and $V_k^{t,R}(x_k,j)$ are the same.


When  3 '  /(3xk+1)  -  0  we  have,  for  t=2, . . . ,  y  ^+1  -1: 


Vv'L  (x>' j) 


=Vk'R(V3> 


9  lMm  0  J(t) 

k 


a  ( j)  a  ( j ) 


9  J(t)  0  J(t) 

x,  =  k  =  k 
k 


a( j)  a ( j ) 


3V^,L(x  ,j) 


3xk 


R 


3Vk  (Xk'9) 


x  = 


9k(t). 


9j(t) 


a*k 


a(j)  a  ( j ) 


(8.46b) 


x  = 


e3(t) 

k 


9a(t) 


a ( j)  a( j) 


3  2V 

For  all  t=2 , .  . .  ,  rl  (regardless  of  the  value  of  K._._  J 

K+l  t  \  ' 


t,u 


k+l 


(8.47b) 


^k+l5 


t  R  t  L 

since  Vfc'  (x^,;])  V^'  (x^/j)  have  the  same  curvature  it  follows 


that: 


for  a(j)x^>  03(t) 


Vk'L(Vj)<  Vk,R(xk' j) 


(8.48) 


for  a(j)xk<  0^(t) 


Proof Step 4: Comparing the Constrained Costs
The fourth step in this proof of Proposition 8.1 is to compare
the solutions of the constrained JLQ problems specified by
(8.42). For each $r_k = j \in M$, $V_k(x_k, r_k=j)$ at each $x_k$ value is the
smallest of the constrained costs in (8.43). That is,


$$V_k(x_k, r_k=j) = \min_{t=1,\ldots,\psi^j_{k+1}} \left\{ V_k(x_k, r_k=j \mid x_{k+1} \in \Delta^j_{k+1}(t)) \right\}. \tag{8.49}$$


This minimization involves the comparison of piecewise-quadratic
functions in $x_k$.
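As a rough numerical illustration of the comparison in (8.49), the sketch below takes a list of candidate cost functions and returns, at a given $x_k$, the smallest cost together with the index of the candidate that attains it. The helper names and the two quadratic candidates are hypothetical; the actual algorithm finds intersections analytically rather than by scanning a grid.

```python
def optimal_piece(x, constrained_costs):
    """Return (min cost, t) over candidates given as (t, f) pairs,
    where f(x) is the constrained cost-to-go for partition interval t."""
    return min((f(x), t) for t, f in constrained_costs)

def joining_points(xs, constrained_costs):
    """Scan an increasing grid xs and report the grid values at which the
    minimizing candidate index changes (approximate joining points)."""
    changes = []
    previous = None
    for x in xs:
        _, t = optimal_piece(x, constrained_costs)
        if previous is not None and t != previous:
            changes.append(x)
        previous = t
    return changes

# Two hypothetical candidate costs that cross at x = 0.125.
candidates = [
    (1, lambda x: (x + 1.0) ** 2),
    (2, lambda x: (x - 1.0) ** 2 + 0.5),
]
grid = [i / 100.0 for i in range(-200, 201)]
switches = joining_points(grid, candidates)
```

In this example the slope of the pointwise minimum drops from 2.25 to -1.75 at the crossing, which is the kind of discontinuous slope decrease at a joining point that the proof goes on to describe.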


In principle we can use (8.49) to find $V_k(x_k, r_k=j)$ and
$u_k(x_k, r_k=j)$ (that is, the quantities $K_k(t:j)$, $H_k(t:j)$, $G_k(t:j)$,
$L_k(t:j)$, $F_k(t:j)$, $\{\delta^j_k(t): t=1,\ldots,m_k(j)-1\}$ and $m_k(j)$ as in (8.36)-(8.37)).
This minimization was done graphically for the examples
of  Section  8.3.  In  general,  we  must  accomplish  the  minimization  of 
(8.49)  by  finding  the  intersections  of  the  quadratic  functions 


$$\left\{ V_k^{1,U}(x_k,j),\ V_k^{1,R}(x_k,j),\ V_k^{2,L}(x_k,j),\ V_k^{2,U}(x_k,j),\ V_k^{2,R}(x_k,j),\ \ldots,\ V_k^{\psi^j_{k+1},L}(x_k,j),\ V_k^{\psi^j_{k+1},U}(x_k,j) \right\} \tag{8.50}$$

and choosing $V_k[x_k, r_k=j]$ at each value of $x_k$ to be the one having
the lowest value there (for those costs that are valid at $x_k$). Thus


$V_k[x_k, r_k=j]$ is piecewise-quadratic in $x_k$ and $u_k(x_k, r_k=j)$ is
piecewise-linear, as claimed in (1) of Proposition 8.1. The verification of (2)


in  the  proposition  is  straightforward,  given  our  requirement  that 


Q11  (!)>_  0  and  QJ  (UJ)>  0  in  (8.9) 


The fact that $\partial V_k(x_k,j)/\partial x_k$ is either continuous or decreases
discontinuously at the joining points $\{\delta^j_k(1), \ldots, \delta^j_k(m_k(j)-1)\}$ follows directly from
the comparison in (8.49); a particular joining point $\delta^j_k(\ell)$ can arise
in two ways:

(1) two (or more) of the constrained costs-to-go in
(8.49) may cross at $\delta^j_k(\ell)$. Since $V_k(x_k, r_k=j)$
is the smallest candidate cost at each $x_k$ value,
the slope of $V_k(x_k, r_k=j)$ must decrease discontinuously
at such a $\delta^j_k(\ell)$.


(2) $\delta^j_k(\ell)$ may be an $x_k$ value where the optimal candidate
cost in (8.49) changes from one of its parts
($V_k^{t,L}$, $V_k^{t,U}$, $V_k^{t,R}$) to another.

When $\partial^2 V_k^{t,U}/(\partial x_{k+1})^2 > 0$ we can have

• $\delta^j_k(\ell) = \theta^j_k(t)/a(j)$, where the left endpoint
constraint becomes inactive (i.e., $V_k(x_k, j \mid t)$
changes from $V_k^{t,L}(x_k,j)$ to $V_k^{t,U}(x_k,j)$)

• $\delta^j_k(\ell) = \Theta^j_k(t)/a(j)$, where the right endpoint
constraint becomes active (i.e., $V_k(x_k, j \mid t)$
changes from $V_k^{t,U}(x_k,j)$ to $V_k^{t,R}(x_k,j)$).

In either of these cases the slope of $V_k(x_k, r_k=j)$ is continuous
at $\delta^j_k(\ell)$.

When $\partial^2 V_k^{t,U}/(\partial x_{k+1})^2 < 0$ we can have

• $\delta^j_k(\ell) = \theta^j_k(t)/a(j) = \Theta^j_k(t)/a(j)$, where the left
endpoint constraint becomes inactive and the right
endpoint constraint becomes active (i.e., $V_k(x_k, j \mid t)$
changes from $V_k^{t,L}(x_k,j)$ to $V_k^{t,R}(x_k,j)$). This occurs
at the crossing point of $V_k^{t,L}(x_k,j)$ and $V_k^{t,R}(x_k,j)$.
Consequently the slope of $V_k(x_k, r_k=j)$ decreases
discontinuously here.

This  concludes  the  proof  of  the  one-stage  solution  given  by 


Proposition  8.1.  Certain  qualitative  properties  of  the  optimal 
controller  that  are  developed  later  in  this  chapter  will  be  used  to 
simplify  the  procedure  that  is  described  above. 


In  this  section  we  examine  several  combinatoric  and  qualitative 
issues  related  to  the  (off-line)  determination  of  the  optimal  control 
laws  and  costs  of  Proposition  8.1.  Aspects  of  the  problem  that  are 
addressed  here  include: 

• the nature of active hedging; examining what values
of $x_{k+1}$ an optimal controller will hedge to and why, and
what values of $x_{k+1}$ will be avoided and why
(Corollary 8.3),

• determining how many of the candidate costs (and
control laws) must actually be computed and
compared (Proposition 8.2),

• characterizing the number of pieces, $m_k(j)$, of the
optimal expected costs $V_k(x_k, r_k=j)$ and control
law $u_k(x_k, r_k=j)$.

The topics studied here are useful in the specification of an
efficient way to carry out the algorithm steps that are indicated
in the proof of Proposition 8.1.

These  facts  will  be  established  as  we  pursue  the  following: 

(1) First we show that many of the candidate costs
in (8.50) cannot be optimal (for any $x_k$ value)
and hence they need not be computed (Proposition 8.2).


(2) Next we show that each candidate cost in (8.50)
can be optimal over, at most, a single interval
of $x_k$ values. This bounds the number of pieces
$m_k(j)$ of $V_k(x_k, r_k=j)$ (Proposition 8.4 and
Corollary 8.5).

(3)  We  then  describe  the  endpieces  (Proposition  8.6) 
of  the  optimal  JLQ  controller  for  these  problems. 

(4)  Finally,  we  use  these  results  to  devise  an 
algorithm  for  the  computation  of  the  optimal 
controller  in  Proposition  8.1  that  is  efficient 
in  the  sense  that  many  of  the  candidate  costs 
in  (8.49)  need  not  be  computed  and  compared. 

The  solution  algorithm  is  presented  in  flowchart  form  and  is  described 

in  detail.  It  is  basically  similar  to  the  solution  algorithm  of 

section  7.2  (for  the  problems  of  chapter  5) . 

The  following  proposition  eliminates  many  of  the  candidate  costs 
in  (8.50)  from  eligibility  for  the  optimal  cost. 


Proposition 8.2: In performing the minimization in (8.49), the
following candidate costs of (8.50) need not be examined:

(i)  if 


b  (3) 


(8.51) 


and 


(8.52) 


and  (xk+1 1  rjc=j )  continuous,  at  y^+^t)  with 


2  ■sL(t+ii'4+i(t> 


(t+i) 


2^i(t»  iU<« 


■W*’ 


(8.53) 


then  we  need  not  examine 


fTt  r  R  <  *\  —  t+l/Xj^  .. 

\  <V3)  —  \  ‘V31  • 


(ii)  if 


b  (j) 


(8.54) 


then  we  need  not  examine 


t  ,u 


I  j 

(iii)  if  vk+1(xk+1!rk=^  is  discontinuous  at  y^+1(t)  with 


CaL(t)-ah+l(t+1)] 


then  we  need  not  examine  vk+^,L^xk'^ 


(8.55) 
The proof of this proposition is in Appendix D.3. □

This proposition is a generalization of Proposition 5.2. For the
problems of Chapter 5, (8.51)-(8.53) are always true and (8.54)
never occurs. For the more general problems of Section 8.2 we must
account (in Proposition 8.2) for additional possibilities. Consequently
we must examine a different (and usually more numerous) set of candidate
costs than for those of Chapter 5 in the minimization of (8.49).

For example, if either (8.51) or (8.52) does not hold, then we
must examine $V_k^{t,R}(x_k,j) = V_k^{t+1,L}(x_k,j)$ even though $\hat V_{k+1}(x_{k+1} \mid r_k=j)$
is continuous at $\gamma^j_{k+1}(t)$. In Examples 8.3 and 8.4 the optimal controller
hedged to continuous points of $\hat V_N(x_N \mid r_{N-1}=1)$ for $x_{N-1}$ intervals over
which these additional eligible candidates were optimal. Hedging to
continuous points of the conditional expected cost-to-go was not possible
for the JLQ problems of Part III.


An illustration of Proposition 8.2(ii) appeared in Example 8.4.
The candidate cost $V_{N-1}^{3,U}(x_{N-1}, r_{N-1}=1)$ was shown to never be valid, because

$$\frac{\partial^2 V_{N-1}^{3,U}}{(\partial x_N)^2} < 0.$$

This second derivative is negative if and only if $\hat K^1_N(3) < -R(1)/b^2(1)$, as in
(8.54). Thus Proposition 8.2(ii) specifies that for example 8.3 we need
not examine $V_{N-1}^{3,U}$ but we do have to examine $V_{N-1}^{3,L}$ and $V_{N-1}^{3,R}$.
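The curvature test behind (8.54) is simple enough to state as code. The sketch below is illustrative only: the function names are hypothetical, and treating the boundary case of exactly zero curvature as ineligible is an assumption made here rather than something stated in the text.

```python
def unconstrained_piece_eligible(K_hat, R, b):
    """True when the interior ('U') candidate for an interval can be valid,
    i.e. when the one-stage cost is strictly convex in x_{k+1}:
    K_hat > -R / b**2."""
    return K_hat > -R / (b * b)

def eligible_interior_intervals(K_hats, R, b):
    """Return the 1-based indices t of partition intervals whose interior
    candidate passes the curvature test."""
    return [t for t, K_hat in enumerate(K_hats, start=1)
            if unconstrained_piece_eligible(K_hat, R, b)]

# Hypothetical conditional-cost curvatures for a four-interval partition.
example = eligible_interior_intervals([0.0, 0.25, -2.0, 0.0], R=1.0, b=1.0)
```

For an interval that fails the test, only the endpoint candidates (the L and R parts) remain to be examined.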
Our requirements in (8.9) on the extreme pieces of the x-costs
guarantee that (8.54) does not hold for $t=1$ and $t=\psi^j_{k+1}$. That is, we must
always examine $V_k^{1,U}$ and $V_k^{\psi^j_{k+1},U}$.
Therefore the endpiece candidate costs $V_k^{1,U}(x_k,j)$ and $V_k^{\psi^j_{k+1},U}(x_k,j)$
are eligible candidates for consideration.

The following corollary specifies necessary conditions for a point
$x_{k+1}$ to be hedged to.


Corollary  8.3: 

If the optimal controller in Proposition 8.1 hedges from $(x_k, r_k=j)$
to the point $x_{k+1} = x$ then one (or more) of the following is true:

(1) $x$ is a discontinuous point of the conditional
cost $\hat V_{k+1}(x_{k+1} \mid r_k=j)$,

(2) $x$ is a boundary ($\gamma^j_{k+1}(t)$ or $\gamma^j_{k+1}(t-1)$) of an
interval $\Delta^j_{k+1}(t)$ over which the conditional cost
$\hat V_{k+1}(x_{k+1} \mid r_k=j)$ has

$$\hat K^j_{k+1}(t) < -R(j)/b^2(j)$$

(3) $x$ is a boundary of intervals $\Delta^j_{k+1}(t)$ and
$\Delta^j_{k+1}(t+1)$ where $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ is continuous and


2Yx+i(t)  9Xi(t,-0Xi<t+i> 


(8.56) 


Proof: Hedging-to-a-point can occur only to finite boundary
points of the $x_{k+1}$ intervals $\{\Delta^j_{k+1}(t): t=1,\ldots,\psi^j_{k+1}\}$; that is, to
an element of the set $\{\gamma^j_{k+1}(t): t=1,\ldots,\psi^j_{k+1}-1\}$. When the optimal
controller drives $x_{k+1}$ to such a point from $x_k$, then either
$V_k^{t,R}$ or $V_k^{t+1,L}$ is the optimal cost from that $x_k$. Proposition 8.2
excludes many of these constrained candidate costs from eligibility.
Corollary 8.3 lists the possible ways that a constrained cost $V_k^{t,R}$
or $V_k^{t+1,L}$ associated with $\gamma^j_{k+1}(t)$ can be eligible. Corollary 8.3(1)
occurs when either Proposition 8.2(iii) or 8.2(iv) holds. Here
we hedge to the low-cost side of a $\hat V_{k+1}(x_{k+1} \mid r_k=j)$ discontinuity.
Corollary 8.3(2) occurs when Proposition 8.2(ii) holds. When (8.54)
is true, $V_k^{t,U}$ is not eligible but both $V_k^{t,L}$ and $V_k^{t,R}$ are
eligible (unless excluded by Proposition 8.2(iii) or (iv)).
Corollary 8.3(3) holds when (8.53) of Proposition 8.2(i) is not
satisfied. □


Note that if one or more of the conditions of Corollary 8.3
is satisfied for some $x = \gamma^j_{k+1}(t)$, we are not guaranteed that the
optimal controller hedges to that $x$; the associated constrained
costs $V_k^{t,R}$ and $V_k^{t+1,L}$ need not be optimal in (8.49).

From Proposition 8.2 we know that the mapping

$$x_k \mapsto x_{k+1}(x_k, r_k=j)$$

need not be one-to-one, in that hedging to points may occur.
Proposition 5.3, which lists a number of general qualitative properties
of the optimal controller, applies for the JLPQ problems
of this chapter. We repeat this proposition here. The proof of
this result for the JLPQ case is somewhat different in detail from
the JLQ case.
(i) $u_k(x_k, r_k=j)$ increases discontinuously at $\delta$ when
$b(j)/a(j) > 0$ (and decreases discontinuously at $\delta$
when $b(j)/a(j) < 0$)


(ii) the mapping $x_k \mapsto x_{k+1}(x_k, r_k=j)$ increases
discontinuously at $\delta$ when $a(j) > 0$ (and decreases
discontinuously at $\delta$ when $a(j) < 0$).

(4) The mapping

$$x_k \mapsto x_{k+1}(x_k, r_k=j)$$

has the following properties:

(i) the mapping is monotonely nondecreasing if
$a(j) > 0$ (and monotonely nonincreasing if
$a(j) < 0$) for each $j \in M$


(ii) it consists of $m_k(j)$ line segments:

• one line segment with positive slope if
$a(j) > 0$ (negative slope if $a(j) < 0$) for
each region where an "unconstrained cost"
$V_k^{t,U}(x_k, r_k=j)$ is optimal: for $V_k(x_k,j) = V_k^{t,U}(x_k,j)$,

$$x_{k+1} = \frac{a(j)R(j)}{R(j)+b^2(j)\hat K^j_{k+1}(t)}\, x_k - \frac{b^2(j)\hat S^j_{k+1}(t)}{2[R(j)+b^2(j)\hat K^j_{k+1}(t)]} \tag{8.59}$$
• a constant line segment for each region
where there is active hedging-to-a-point:
$x_{k+1} = \gamma^j_{k+1}(t)$.

(iii) there are regions of $x_{k+1}$ avoidance associated
with (and only with) each $x_k = \delta$ value where
the slope of $V_k(x_k, r_k=j)$ decreases discontinuously.

(5) Each candidate linear control law associated with the costs
listed in (8.50) can be optimal over, at most, a single
interval of $x_k$ values.
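The slope of the unconstrained line segment in item (4)(ii) can be checked with a short derivation. The sketch below assumes the scalar noiseless dynamics $x_{k+1} = a(j)x_k + b(j)u_k$ and writes the conditional cost on one partition interval as $\hat K x_{k+1}^2 + \hat S x_{k+1} + \hat P$; the shorthand $J$, and dropping the indices $j$, $t$ and $k+1$ on the parameters, are conveniences introduced here.

```latex
\begin{align*}
J(x_{k+1}) &= R\left(\frac{x_{k+1}-a\,x_k}{b}\right)^{2}
              + \hat K x_{k+1}^{2} + \hat S x_{k+1} + \hat P \\[4pt]
\frac{dJ}{dx_{k+1}} &= \frac{2R}{b^{2}}\,(x_{k+1}-a\,x_k)
              + 2\hat K x_{k+1} + \hat S = 0 \\[4pt]
x_{k+1} &= \frac{aR}{R+b^{2}\hat K}\,x_k
           - \frac{b^{2}\hat S}{2\,[R+b^{2}\hat K]} \\[4pt]
\frac{\partial x_{k+1}}{\partial x_k} &= \frac{aR}{R+b^{2}\hat K}
\end{align*}
```

With $R > 0$, the slope has the sign of $a$ exactly when $R + b^2 \hat K > 0$, that is, when $\hat K > -R/b^2$.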

Proof: Items (1)-(3) and (5) are proven exactly as for Proposition
5.3 in Appendix C.4. For item (4): From 3(ii) we have that the
mapping

$$x_k \mapsto x_{k+1}(x_k, r_k=j)$$

increases discontinuously at joining points where $V_k(x_k, r_k=j)$ is
not differentiable, and from (2) the mapping is continuous at
other joining points.

Now  between  joining  points,  if  the  optimal  cost  corresponds 
to  hedging-to-a-point  then  clearly  the  mapping  is  constant.  If 
the  optimal  cost  does  not  correspond  to  hedging-to-a-point,  then 
in  such  a  region 

vyvj)  *  vk,0(iv^)  -  +  v4!t>  *  °i(t> 


I 


for  some  te{l, . . .  Thus  from  (8.58) 


b2(j) 


3WV1} 


WWj)  55  a(j)\  -  STjTRTj )  ' 


hence 


!Vi  ... 

3^  '  3  2a(j ) R( j) 


b2H)  3  vic1Vrk*i) 

(3V2 


=  a  ( j )  - 


2  j 


b2(j) 

2a(j)R(j) 


2s£(t) 


*k+l 


If  R( j )  +  b  k£  (t)-0  then  Kj(t)-0  ,  so  —  =a(j)>0 


If $R(j) + b^2(j)\hat K^j_{k+1}(t) \neq 0$ then

$$\frac{\partial x_{k+1}}{\partial x_k} = \frac{a(j)R(j)}{R(j)+b^2(j)\hat K^j_{k+1}(t)} \tag{8.60}$$

If $\hat K^j_{k+1}(t) > -R(j)/b^2(j)$ then $\partial x_{k+1}/\partial x_k > 0$ in (8.60) if
$a(j) > 0$ (and $\partial x_{k+1}/\partial x_k < 0$ if $a(j) < 0$). But for $V_k^{t,U}(x_k, r_k=j)$
to be optimal over some interval of $x_k$ values, we must have

$$\hat K^j_{k+1}(t) > -R(j)/b^2(j)$$

by Proposition 8.2. Thus we have 4(i), (ii). The remainder of
the proof of Proposition 8.4 follows the proof of Proposition 5.3
in Appendix C.4, exactly.

Proposition  8.2  restricts  the  number  of  candidate  costs  that 
must  be  considered  in  (8.49)  and  fact  (5)  of  Proposition  8.4  says 
that  each  candidate  can  be  optimal  over  at  most  one  x^  interval. 
Thus  we  immediately  have  from  (8.50): 


Corollary  8.5: 

The number of pieces of the optimal expected costs-to-go
$V_k(x_k, r_k=j)$ and their associated control laws are bounded above by

$$m_k(j) \le 3\psi^j_{k+1} - 2 \tag{8.61}$$

A  weaker  bound  which  follows  from  (8.41)  is 


V*k 


\(i)<  3  [  [vji+^+ink+i(i)1 

1€cj 


-  8 


The bound on the growth of the number of pieces, $m_k(j)$,
(as $(N-k)$ increases) in Corollary 8.5 is much larger than for
the JLQ problems (in Corollary 5.4). Corollary 8.5 suggests that
the number of pieces in each optimal expected cost $V_k(x_k, r_k=j)$ may
grow geometrically rather than linearly (because of the factor
$3m_{k+1}(i)$ in (8.62)).

We examine next the behavior of the optimal JLPQ controller
when $x$ is far from zero. As in the JLQ controller, over their endpieces
$V_k(x_k, r_k=j)$ and $u_k(x_k, r_k=j)$ can be computed from sets of
recursive difference equations. These equations correspond to the
solutions of x-independent form probability, single-piece x-cost
JLQ problems (as in Chapter 3).

For finite time horizon problems, if $x_k$ is negative enough or
positive enough the optimal strategy will be to keep $x$ in the same
extreme piece of the form transition probabilities $p(j,i:x)$ and
x-costs $Q(x,j)$, $Q_T(x,j)$, for all $i$, from each $j \in M$, for all future
times.


Proposition  8.6;  Endpieces 

Consider  the  JLQ  problem  of  Proposition  8.1. 

(1) For $x_k \le \delta^j_k(1)$, the optimal control laws and expected
costs-to-go are

$$V_k(x_k,j) = V_k^{1,U}(x_k,j) \triangleq V_k^{Le}(x_k,j) = x_k^2 K_k^{Le}(j) + x_k H_k^{Le}(j) + G_k^{Le}(j) \tag{8.63}$$

$$u_k(x_k,j) = u_k^{1,U}(x_k,j) \triangleq u_k^{Le}(x_k,j) = -L_k^{Le}(j)\,x_k + F_k^{Le}(j) \tag{8.64}$$

(2) For $x_k \ge \delta^j_k(m_k(j)-1)$, the optimal expected costs-to-go and
control laws are

$$V_k(x_k,j) = V_k^{\psi^j_{k+1},U}(x_k,j) \triangleq V_k^{Re}(x_k,j) = x_k^2 K_k^{Re}(j) + x_k H_k^{Re}(j) + G_k^{Re}(j) \tag{8.65}$$

$$u_k(x_k,j) = u_k^{\psi^j_{k+1},U}(x_k,j) \triangleq u_k^{Re}(x_k,j) = -L_k^{Re}(j)\,x_k + F_k^{Re}(j) \tag{8.66}$$
(3)  The  parameters  in  (8. 63) -(8. 66)  are  computed  recursively, 
backwards  in  time  from  N  by 


^  3  R(j)+b2(j)K^L(j) 


(8.67) 


Le  a(j)R(j)H?®  (j) 

“k  (j)  =  - 2  ■  7£T~ 

R(j)+b^(j)xJJ  (j) 


(8.68) 


Le...  *Le  . ..  b2(3»^l(3>l2 

Gk  (j)  "  Gk+1(3) - 2 - ^ - 

4[R(j)+bZ(j)K^1(j)] 


(8.69) 
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u  \  (1) [k£®  (i)+Q: (1)] 

■i  a/%  iC+*L 


Zj  Xj;  (1)  [S,w,  (U+p1!!)] 


ji'  ' 1  k+1 


a2(j)R(j)K^®1(j) 

R(j)+b2(j)K^1(j) 


a(j)R(j)H^1(j) 

R(j)+b2(j)K^1(j) 


.  2 , . .  r~Re  . . . .  2 
b  (3) IH.  (3)] 


4[R(j)+b-(j)K^~1(j)] 


VV  >’ 


VV  iC'iHsSi?)] 


■.  ■■  1 


Proof: This proposition is essentially the same as Proposition 6.1,
except for the parameters of the extreme x-cost pieces in (8.70)-(8.72),
(8.76)-(8.78) and terminal conditions (8.79)-(8.84).

In  these  extreme  pieces  since  we  have  assumed  (in  (8.9))  that 


Qj(l)>  0 


(vP )  >  0 


Kj(l>>  0 


for  all  j€M,  we  will  have 


kt(^)>  0 


4+1(l)>  and 


b2(j) 


Thus by Proposition 8.2, $V_k^{1,U}(x_k,j)$ and $V_k^{\psi^j_{k+1},U}(x_k,j)$ will be
valid subproblem solutions in (8.49) for some range of $x_k$ values.
Therefore we can apply the arguments of Appendix C.5 directly to
establish Proposition 8.6. □
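One backward step of an endpiece recursion of this kind can be sketched as follows. The sketch assumes the scalar dynamics $x_{k+1} = a x_k + b u_k$ and takes as inputs the conditional (already averaged over next forms) quadratic, linear and constant cost coefficients, here called `K_hat`, `H_hat`, `G_hat`; these names, and the function itself, are illustrative assumptions and do not reproduce the full set of equations (8.67)-(8.84).

```python
def endpiece_step(a, b, R, K_hat, H_hat, G_hat):
    """One backward step for a single-piece quadratic cost.

    Minimizes R*u**2 + K_hat*x1**2 + H_hat*x1 + G_hat over u, with
    x1 = a*x + b*u, and returns (K, H, G) such that the minimum equals
    K*x**2 + H*x + G. Requires R + b**2 * K_hat > 0.
    """
    denom = R + b * b * K_hat
    if denom <= 0:
        raise ValueError("cost is not strictly convex in x_{k+1}")
    K = a * a * R * K_hat / denom
    H = a * R * H_hat / denom
    G = G_hat - b * b * H_hat * H_hat / (4.0 * denom)
    return K, H, G

# Hypothetical parameter values, used only to exercise the function.
K, H, G = endpiece_step(a=0.9, b=1.0, R=1.0, K_hat=0.5, H_hat=0.2, G_hat=3.0)
```

The three returned expressions follow from completing the square in $x_{k+1}$; iterating the step backwards from the terminal cost gives a Riccati-like difference equation for the endpiece parameters.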

The conditions for the existence of steady-state endpiece cost
parameters and control law parameters are the same as in Proposition 6.2,
and will not be repeated here.

We have identified some basic qualitative properties of the
JLPQ problem that can be used to reduce the combinatorics involved
in the "brute-force" solution of the one-stage problem that was
presented in the proof of Proposition 8.1. We now present a
solution algorithm that exploits these properties, enabling us
to solve the general JLPQ problem (8.1)-(8.10) efficiently.

This algorithm is based upon the application of the one-stage
solution of Proposition 8.1 recursively, backwards in time, for each
form $j \in M$ that the system can take. The basic idea of the JLPQ solution
algorithm is the same as in the JLQ solution algorithm of Section 7.2.
For each form $j \in M$ at time $k$, we can compute $V_k(x_k, r_k=j)$ and
$u_k(x_k, r_k=j)$ one piece at a time, sweeping from left to right along
the axis of $a(j)x_k$ values.

An overview of the solution algorithm is shown in Figure 8.5.
The algorithm is initialized with the terminal time ($k=N$) cost
parameter (block 2). Then for successively decreasing times through
$k=k_0$ (block 13), the one-stage solution of Proposition 8.1 is obtained
for each form $j \in M$ (block 10). Figure 8.5 differs substantially from
the analogous flowchart (Figure 7.1) of Section 7.2 only in the
initialization block (block 1).
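The outer loop of the overview can be sketched as a generic backward recursion. In the sketch below, `terminal_cost` and `one_stage` are hypothetical callables supplied by the caller: the first stands in for the initialization block and the second for the one-stage solution of Proposition 8.1, whose internals are not modeled here.

```python
def solve_backward(forms, N, k0, terminal_cost, one_stage):
    """Apply a one-stage solver recursively, backwards in time.

    terminal_cost(j)            -> cost description at time N for form j
    one_stage(k, j, next_costs) -> cost description at time k for form j,
                                   given a dict {form: cost at time k+1}
    Returns a dict mapping (k, j) to the cost description.
    """
    costs = {(N, j): terminal_cost(j) for j in forms}
    for k in range(N - 1, k0 - 1, -1):      # k = N-1, N-2, ..., k0
        next_costs = {j: costs[(k + 1, j)] for j in forms}
        for j in forms:                      # one solve per form
            costs[(k, j)] = one_stage(k, j, next_costs)
    return costs

# Toy stand-ins: the "cost description" is just a stage counter.
result = solve_backward(
    forms=[1, 2], N=3, k0=0,
    terminal_cost=lambda j: 0,
    one_stage=lambda k, j, nxt: 1 + max(nxt.values()),
)
```

Storing the results by `(k, j)` reflects the off-line nature of the computation: every stage and form is solved once, in advance of system operation.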

In the following discussions we refer to the algorithm flowchart
shown in Figures 8.5-8.11. All of the steps indicated in this flowchart
constitute one iteration of block 10 in Figure 8.5. That is,
they determine the one-stage JLPQ solution that is specified by
Proposition 8.1 for some time stage k and form j. For the reader's
convenience, a table of block number locations and entry points is
given in Table 8.5.


Figure  Number  Block  Numbers  Entry  Points 


Exit  Points 


Start  (blk.  1) 


Stop  (blk.  14) 


16-26 


27-51 


52-63 


64-68 


69-76 


77-86 


from  block  10  Q  (blk.  22) 


(J)  (blk.  27) 


©  (blk.  64) 


®  (blk.  69) 
0  (blk.  76) 

0  (blk.  77) 
(?)  (blk.  80) 


<D  (blk.  46,48) 
(p  (blk.  31) 


©  (blk.  52)  Q)  (blk.  63) 


0  (blk.  68) 
(2>  (blk.  68) 

(D  (blk.  70) 


©  (blk.  84) 
0  (blk.  86) 


return  to  block  10 
from  blk.  82) 


TABLE  8.5;  Block  Number  Locations  and  entry  points  for 
JLPQ  solution  algorithm  flowchart. 
Figure 8.5: Algorithm Overview
A macroscopic overview of the algorithm specified by this
flowchart is as follows:

1. The algorithm is first initialized (in block 1) at
time N with the terminal x-cost $Q_T(x,j)$ for each
$j \in M$.

2.  The  determination  of  the  optimal  controller  at  time 
k  for  a  fixed  j  value  constitutes  one  iteration  in 
block  10. 

3. The computations of block 9 begin in block 26 with
the determination of the composite $x_{k+1}$ partition
(block 16). This partition is obtained from the
joining points of $V_{k+1}(x_{k+1}, r_{k+1}=i)$ for all $i \in C_j$
that were computed in the previous time stage, and
from known parameters of the problem.

4. For problems that are symmetric about zero we only compute
this grid for $x_{k+1} \le 0$. This is accomplished by
blocks 17, 23, 24 and 25. If $x_{k+1}=0$ is a grid point
we must be sure to include it in later calculations
(blocks 20, 21).

5.  The  next  task  is  to  determine  which  candidate  cost-to-go 
functions  are  eligible  for  optimality  with  respect  to 
Proposition  8.2,  and  to  compute  the  parameters  for  these 
eligible  functions.  This  is  done  in  Figure  8.7.  We 
begin  by  computing  the  conditional  cost  parameters  for 

Vi'Vil  V11  £or  a11  ‘-1 . 

We  also  calculate  K,  here. 


in  block  27. 


6. By Proposition 6.1, the endpiece cost $V_k^{1,U}(x_k,j)$
is always an eligible candidate. It is computed
in block 28. If $\psi^j_{k+1}=1$ then we are done (block 31).
If not, then the $x_{k+1}$ partition piece counter is
set to t=1 in block 32, and the variable Lside is
set to "yes" in block 33. The variable Lside answers
the question:

"is $\hat K^j_{k+1}(t) > -R(j)/b^2(j)$?"

That is, is (8.51) satisfied on the
left side of $\gamma^j_{k+1}(t)$?

7. In blocks 29, 35, 37 we determine if $\hat V_{k+1}(x_{k+1} \mid r_k=j)$
has a discontinuity at $x_{k+1} = \gamma^j_{k+1}(t)$. If there
is a discontinuity, then either $V_k^{t,R}$ or $V_k^{t+1,L}$ is
computed (as specified by (iii)-(iv) of Proposition
8.2) in block 34 or 36.

8. If $\gamma^j_{k+1}(t)$ is not a discontinuous point of
$\hat V_{k+1}(x_{k+1} \mid r_k=j)$ then we enter block 38. If
Lside=no then either t=1 or

$$\hat K^j_{k+1}(t) \le -\frac{R(j)}{b^2(j)}.$$

That is, (8.51) of Proposition 8.2(i) is not satisfied
so we must compute $V_k^{t,R}(x_k,j)$ and $V_k^{t+1,L}(x_k,j)$, in
block 45.

9. If Lside=yes in block 38 then we check condition (8.53)
of Proposition 8.2(i) in block 40, and compute
$V_k^{t,R}(x_k,j)$ and $V_k^{t+1,L}(x_k,j)$ if required (in block 45).

10. The $x_{k+1}$ partition piece counter, t, is incremented
in block 39. If $t=\psi^j_{k+1}$ (block 41) and the problem is
not symmetric (block 43) then we compute
$V_k^{\psi^j_{k+1},U}$ (in block 46) which, by Proposition 8.6, is
an eligible cost candidate.

11. If $t=\psi^j_{k+1}$ (block 41) and the problem is symmetric
(block 43) then $V_k^{\psi^j_{k+1},U}$ is not an endpiece (because of
block 24). Therefore we compute $V_k^{\psi^j_{k+1},R}$
(in block 44) and test to see if $V_k^{\psi^j_{k+1},U}$ should also
be calculated (in block 47), according to (8.54) of
Proposition 8.2(ii). If it should, we pass from
block 47 to 50 to 46. If not, we exit via block 48.

12. If $t < \psi^j_{k+1}$ in block 41 then we assign the Rside value
to Lside (in block 42) and then test (8.54) of
Proposition 8.2(ii) (in block 47) to see if $V_k^{t,U}$
should be calculated. If it should, we set Rside to
yes (in block 51) and perform the calculations in
block 30. If not, we set Rside to no (in block 49)
and compute $V_k^{t,R}$ and $V_k^{t+1,L}$ (in block 45).


Upon leaving Figure 8.7, we have calculated all
of the cost parameters for eligible candidate
cost functions. Figure 8.7 is substantially different
from Figure 7.3 of the algorithms of Section
7.2, due to the major differences between
Propositions 5.2 and 8.2.

We next prepare for the rightward sweep along the
$a(j)x_k$ axis by obtaining in Figure 8.8 the
partition of the real line (of $a(j)x_k$ values) that
is caused by the points $\{\theta^j_k(t),\ \Theta^j_k(t-1):\ t=2,\ldots,\psi^j_{k+1}\}$.

If  the  problem  is  symmetric  we  compute  as 

well,  in  block  57.  In  block  63  we  obtain  the  grid 
ordering  required  for  the  righthand  sweep. 
Initialization  of  the  righthand  sweep  is  completed  in 
Figure  8.9,  where  the  endpiece  result  of  Proposition 
8.6 is applied. Figures 8.8 and 8.9 are more
complicated than Figure 7.9 in the Section 7.2
algorithms, due to the different $\theta$ and $\Theta$ computations
that arise when $\hat K^j_{k+1}(t) < -R(j)/b^2(j)$.

Finally, the algorithm performs the minimization in
(8.49) over each interval of $a(j)x_k$ values in the
$\theta$-$\Theta$ partition, starting on the left. This task,
shown in Figures 8.10-8.11, is identical to the steps
in Figures 7.5-7.6 in the Section 7.2 algorithm,
except for blocks 80 and 81. If $x_{k+1}=0$ was a grid
point of the $x_{k+1}$ partition (that is, if zflag=yes)
and the problem is symmetric, then we have $\delta^j_k(m+1)=0$.


This  completes  the  derivation  of  a  solution  algorithm  that 


computes,  off  line,  the  optimal  control  laws  and  expected  cost- 
to-go  parameters  for  the  general  class  of  finite  time-horizon 
JLPQ  problems  formulated  in  Section  8.2. 

In the next section we conclude this chapter with the application
of this algorithm to an example problem.

8.6  Using  the  JLPQ  Solution  Algorithm 

In  this  section  we  will  use  the  algorithm  of  Figures  8.5- 
8.11  to  solve  the  jump  linear  piecewise-quadratic  control 
problem  of  Example  8.2  for  another  time  stage. 

Example 8.5: (Example 8.2 at k=N-2):

Recall that the x-cost $Q(x_k, r_k)$ for this problem has
three constant pieces (specified by (8.21) and shown in Figure
8.2(a)). The optimal controller parameters at time k=N-1 are
listed in Table 8.2 and are shown in Figure 8.4.

We apply the solution algorithm of Figures 8.5-8.11 at time
k=N-2, for j=1:

1. Obtaining the composite $x_{N-1}$ partition in block 26
we have


1 


(symf lag=yes) 


Vi'°> 


=  -23.41 3 


Vi«> 


VilJI 


7»-l(4> 


(zf lag=no) 


2.  Computing  in  block  27: 


Vi=1 


Vi!1)a0 

5n-i(1)=0 

a3;  -  (1)  =968.  75 

N-l 

k3;  ,  (2) =.25 

N— 1 

S  .  (2)  =.25 

N-l 

G1,  ,  (2)  =837. 5625 

N-  X 

sLl13’--75 

G3;  .(3)  =  512.6875 

^-1(4>=° 

G3;  .  (4)  =437. 5 

N— 1 

In  block  28:  V~'“  =  968.75 


In  blocks  29— >35 — >37: 


G*  _  (1,1)  =  151.92  =  S1,  .(1,2) 

N— £  N- 2. 


In  block  40: 


=  -5.853  < -.125  = 


«V-1U)-  N-I 
In block 39, t=4=^ and symflag=yes in block 43,

N-l 

so  we  compute 


.,4,R 


N-2 

in  block  44  and,  since 


=  V2  +  437‘5 


(4)  »  0  >  =  -1  , 

N_1  b2(l) 


we  compute 

V4'?  =  968.75 
N-l 

in  block  46. 

We  have  now  computed  the  eligible  candidate  costs-to-go 
(according  to  Proposition  8.2): 


1,U  2,U  3,L  3,0 

VN-2'  VN-2'  \-2'  VN-2'  V 


We  compute  the  grid  of  9-0 


3,R  4,0  „4,R 

N-2'  N-2'  N-2  • 

values  in  Figure  8.8: 


0N-2(1) 


-.23.413 


eiU,2> 


O31  ■ 


v2(3)  - 


N-2  ^  - 


-29.141 

-1.125 

-1.375 

-.5 

.0 


(block  52) 


(block  59) 


(block  57) 


91.  „(4) 


-.5 


(block  56) 


Ordering  these  grid  points  as  specified  in  block  63: 


£  ?(2)<  £  ^(i)<  9i  ^(3)<  £  o(2)<  £  ^(4)  =  £  o(3 

N-2  N-2  N-2  N-2  N-2  N-2 


In  blocks  65-*  67-*  68: 


V2(1:1) 


HN-2(l!l) 


0 

0 


-  LN-2(1:1) 


=  v2(1:1) 


GN-2(1:1) 


=  968.75 


m=l 


We  begin  the  search  along  the  9-0  partition  of  a(j)x^ 
values  in  the  interval 


-29.141)  =  (-«,  9  (2) )  . 

N-2 


The  initial  list  of  eligible  costs  is 


V1'!?  (prevailing  cost)  and 
N- 2  N—  2 


In  block  69  we  find  that  the  leftmost  intersection  of 


V^2  “d  VN-2  U  “ 


Vj  ■  -24-04 


In  blocks  70 -*  71-^77—^79: 

we  move  into  the  next  partition  piece 


(9*  o^2>'0i  o^>>  =  <-29.141,  -23.413) 
N-2  N-2 


1,U  3,L 

Here  the  prevailing  cost  VN_2  an<^  Vn-2  are  t*ie 


eligible  and  valid  candidate  costs. 


Since  the  prevailing  cost  V  _  is  still  valid  we  go 

N-2 

to  block  69  (Figure  8.10).  The  intersection  of 

is  at  x  ,  _  =  -24.04,  so  we  go  to  block  72: 

N-2  N-2  ’ 


(1)  =  -24.04 


3,L 

V  .  now  prevailing 
N— 2 


Setting  m=2  (block  76)  we  have  (by  block  75) : 


KN-2(25l)  =  1  "  W2:1) 

HN-2(2ll)  =  2 

G,  _ (2:1)  =  438.6875 
N~z 

V2(2:1)  =  -1 

In  block  73  we  remove  V*'!?  and  V2'!?  from  the  eligibility 

N-2  N-2 

list. The only currently eligible valid cost is the
prevailing cost $V^{3,L}_{N-2}$, so no new intersection needs to be
found in block 69.


In block 79 we move rightward into

(ei  0(i),  ei  -on  =  (-23.413,  -1.375) 

N~Z  N" Z 


The  list  of  valid  eligible  costs  still  contains  only 

v3,L. 

In  blocks  86-^83-»  7  0-^77  -»79— >83 : 


we move rightward into

(61  _ (3) .01  .(2))  =  (-1.375,  -1.125)  . 


V  '  ceases  to  be  valid  and  is  replaced  by  V  '  (which 
N—  2  N—  2 

is  now  the  only  eligible  valid  cost) . 

In  86-*85  -»84-*76-*75: 

.(2)  =  -1-375 

N“2 

and 

3  U 

V  '  is  the  new  prevailing  cost  (block  49). 

N—  2 

m=3 


KN-2(3;1) 

*  -4286 

"  LN-2(3:1) 

V2(3:1) 

=  .4286 

GN-2(3!l) 

=  437.607 

FN-2(3s1) 

-  -.2143 

3  L 

We  remove  V  from  the  eligibility  list  (block  73) 

N— 2 

3,U 

and  since  only  VN_2  is  valid  and  eligible,  there  are 
no  new  intersections  to  compute  in  block  69. 

In  blocks  70  ~»77  -*79  -.83  -*86  -*69 : 

<2<2)'  9l2{4)  =  0N-2(3))  *  (-1*125'  -*5)  • 

The  eligible  valid  costs  still  include  only  the 

prevailing  cost  V3,G  so  no  new  intersections  are 
N—2 


computed  in  block  69. 


27.  In  blocks  70-»77-*79-*83  -»86  ->  85-*  84 


we  move  into  the  interval 

(9N-2(4)  =  0N-2(3)'  0N-2(4))  =  (-'5'  0)* 

We  have  moved  past  both  0^  _  (4)  and  0?"  _  (3)  ,  so  the 

N—  2.  N—  2. 

4  U 

list  of  valid  eligible  candidates  becomes  v  ' 

N-2 

3,U 

only-  Thus  the  prevailing  cost  V ,  _  is  no  longer 

N-2 

valid.  We  set  6^"  „(3)  =  -.5  and  the  new  prevailing 
N— 2 

4,U 

cost  is  VXT  -  . 

N-2 


28.  In  blocks  76-* 77: 

m=4 

V2(4:1)  =  °  -  LN-2(4:1) 

HN-2(4s1)  =  °  =  FN-2(4:1) 

G  _ (4:1)  =  437.5 
N-2 

29.  In  blocks  73-*  69  -*70  -*77  -*78 : 

4  U 

Only  VN^  is  eligible,  so  there  are  no  intersections 

to  compute.  We  are  in  the  last  partition  interval  and 
a(l)>  0. 

30.  The  problem  is  symmetric  but  zflag=no,  so  this  k=N-2 
iteration  for  j=l  is  completed. 


Collecting the results of the above steps, we have the optimal
controller from $(x_{N-2}, r_{N-2}=1)$ in example 8.5 (example 8.2 at
k=N-2), as listed in Table 8.6 below, and shown in Figure 8.12.

Comparing these results at k=N-2 with the k=(N-1) results of Table
8.2 and Figure 8.2 we see that:

1. For small $|x_k|$ ($|x_k| < .5$) the optimal controller spends no
control energy to move the x-process, since it is already
in the Q(x,r=1)=0 piece. This is true at k=N-1 and k=N-2
(and at all other times for this example).

2. For large $|x_k|$ ($|x_{N-1}| > 23.413$ and $|x_{N-2}| > 24.04$) the
optimal control is also zero.

3. In example 8.5, for $1.375 < |x_{N-2}| < 24.04$ the optimal
controller hedges to $x_{N-1} = -1^+$ or $1^-$. These are the low-cost
sides of the $V_{N-1}(x_{N-1} \mid r_{N-2}=1)$ discontinuities.
At time k=(N-1) we did not hedge to these discontinuities
for any $x_{N-1}$.

Note that we not only hedge to $x_{N-1} = -1^+$ or $x_{N-1} = 1^-$ but, by
Figure 8.2(d), we will also hedge at the following time
step to $x_N = -.5^+$ or $.5^-$. That is, at time N-2 the controller
actively hedges to place the x process in the advantageous
p(1,2:x) piece. Then at time N-1 the controller actively hedges
to place the x process in the advantageous Q(x,r=1) piece
as well.

4. In example 8.5, for $.5 < |x_{N-2}| < 1.375$, we choose control
$u_{N-2}(x_{N-2},1)$ which results in $.5 < |x_{N-1}| < 1$. Then at time
k=(N-1), the optimal controller hedges to the joining points
$x_N = \pm .5$ of Q(x,r=1). Here the optimal controller doesn't have
to hedge-to-a-point with $u_{N-2}$ to place $x_{N-1}$ in the advantageous
probability piece ($|x_{N-1}| < 1$), but it does hedge-to-a-point to
get $x_N$ into the preferred Q(x,r=1) piece ($|x_N| < .5$).


8.7  Summary 

In this chapter we have extended the kinds of x-operating costs
that can be incorporated in the formulation and solution of control
problems for jump linear systems. We have developed a solution
algorithm that determines the optimal controller for perfectly
observed, noiseless, scalar jump linear systems where the form
transition probabilities are piecewise-constant in x and the
x-operating and terminal costs are piecewise-quadratic. These costs
may contain discontinuities and they may be concave over any but
their extreme pieces.

The qualitative results and solution algorithm for the JLPQ
problem that have been developed here provide a basis for the
approximate solution of scalar jump linear control problems with
quadratic control operating costs and

• x-operating costs $Q(x_k, r_k)$
• x-terminal costs $Q_T(x_N, r_N)$
• form transition probabilities
• input noise densities

that are piecewise convex and concave. This is the topic of the next
chapter.
9. CONTROL OF JUMP LINEAR SYSTEMS WITH ADDITIVE INPUT NOISE


9.1 Introduction

In  this  chapter  we  extend  the  solution  methodology  of  chapters  5-8 
to  address  a  larger  class  of  scalar  jump  linear  control  problems, 
possessing  additive  input  noise  and  a  more  general  class  of  x-dependent 
form  transition  probabilities,  x-operating  costs  and  x-terminal  costs. 
Specifically  we  consider  scalar  jump  linear  control  problems  with 
quadratic  control  penalties  and 

•  input  noise  densities  that  are  twice  continuously  differentiable 
except  at  a  finite  number  of  points, 

• x-operating costs $Q(x_k, r_k)$, x-terminal costs $Q_T(x_N, r_N)$ and
form transition probabilities $p(i,j:x)$ that are twice
continuously differentiable in x, except at a finite number of
points; they consist of a finite number of convex or concave
(in x) pieces.

We  call  this  the  jump  linear  piecewise  convex  (JLPC)  control  problem. 

Our  study  of  this  class  of  problems  is  motivated  by  a  desire  to 
make  the  solution  approach  of  chapters  5-7  applicable  to  more  realistic 
control  problems.  The  discussion  in  this  chapter  builds  directly  upon 
the  JLPQ  problem  formulation  and  solution  of  chapter  8.  In  turn,  the 
results of this chapter provide a basis for the study of jump linear
control problems possessing n-dimensional state processes and
control-dependent form transition probabilities, in chapter 10.

The major extension of this chapter is the inclusion of additive
input noise in the x-process dynamics. As we indicated in earlier
chapters, we can approximate general x-dependent form transition
probabilities in a piecewise-constant way, and general x-operating and
terminal costs by piecewise-quadratic functions. We cannot, however,
reasonably approximate the behavior of jump linear controllers subject to
additive noise by noiseless JLQ or JLPQ controllers. As we shall see in
this chapter, additive input noise profoundly changes the nature of the
optimal controller. Because of the noise it is not possible, in general,
to use the control input to drive the x process into a specified interval
of values (or to a boundary of such an interval) with certainty.
Consequently, we cannot solve a noisy JLQ or JLPQ problem by comparing
the solutions of constrained-in-x subproblems.

For JLPQ problems like those in chapter 8, the presence of additive
input noise leads to the loss (in general) of the piecewise-quadratic
structure of the optimal controllers, due to the "blurring" effects of
the noise. If the noise density has a piecewise structure (that is,
if it is twice differentiable except at a finite number of points,
the piece boundaries), then the optimal controller's expected cost
will also have a piecewise (but not quadratic) structure. We have
included more general piecewise structures for the x-operating costs,
x-terminal costs, form transition probabilities and noise densities
in the JLPC problem formulation because the piecewise-quadratic
structure of the optimal cost is lost in any event.

In this chapter we will show how JLPC control problems with additive
input noise can be reformulated at each time stage k as different,
equivalent JLPC control problems that do not possess input noise. These
reformulated problems involve an artificial variable $z_k$, which replaces
$x_k$. These reformulated JLPC problems can be solved using the approach
of chapters 5 and 8:

We break up the reformulated JLPC problem into constrained
subproblems, and then we compare these subproblem solutions to
determine the optimal controller. These constraints are in the
artificial variable $z_k$ (instead of $x_k$).

At each step in time, the control problem involving the search for
$V_k(x_k, r_k=j)$ results in a set of noiseless, constrained-in-$z_{k+1}$
subproblems with z-independent form transition probabilities and
single-piece z-costs. Each subproblem can be solved analytically, and the
resulting subproblem optimal costs are compared at each $x_k$ value to
obtain $V_k(x_k, r_k=j)$. However, since the subproblem solutions are not
in general quadratic in $z_k$ (or $x_k$), we don't have the nice inductive
solution structure of the problems of chapters 5 and 8. At each time,
the analytical steps required to minimize the constrained subproblems
may be quite different.

We propose, therefore, a suboptimal approximation of the one-step
JLPC control problem (at each time stage k) that results in
controllers that are piecewise-linear in $x_k$. This approximation
method constrains the controller to drive the system to one of an
arbitrary grid of points (z values). This is essentially a
brute force approach which is subject to significant error as the
number of time stages of approximation increases. Better
approximation methods that utilize knowledge of the problem structure
can probably be obtained, at least for certain classes of JLPC
problems. We have not investigated approximate solutions in detail
here.

This chapter is organized as follows:

1. In section 9.2 we formulate the general JLPC control problem
with additive input noise.

2. In section 9.3 we describe how the basic solution approach
of chapters 5 and 8 must be modified when there is additive
input noise in the x dynamics. We use two example problems
to illustrate this process.

3. In section 9.4 we solve for the last stage controller of
a JLPC example problem (example 9.3). The example problem is
the same as example 8.5, except for the additive input noise.
We compare the resulting controllers for each problem to
illustrate the effects of the additive noise. An approximate
solution method is also developed for example 9.3.

4. In section 9.5 we derive a general one-step solution procedure
for the JLPC problems that are formulated in section 9.2. This
procedure is patterned after the examples of sections 9.3 and 9.4.

5. In section 9.6 we establish a number of qualitative properties
of the optimal one-step JLPC solution. In particular, we describe
active hedging in JLPC controllers.

6. In section 9.7 these results are then used to construct an
algorithm for the efficient determination of the optimal controller.
This algorithm is presented in flowchart form.

7. In section 9.8 we demonstrate the application of the optimal
controller derivation algorithm to an example problem. This
example serves to illustrate the need for numerical methods
(as opposed to analytical methods) in certain steps of the
algorithm.

8. In section 9.9 we consider the approximation method that
was discussed above. The resulting controller is applied
to two time stages of the example of section 9.8, and
the performance of the approximate and optimal controllers
is compared.


9.2  JLPC  Problem  Formulation 


In  this  section  we  formulate  the  jump  linear  piecewise  convex  (JLPC) 
control  problem  with  additive  input  noise  that  is  addressed  in  this 
chapter.  As  in  earlier  problems  we  restrict  our  attention  to  the  time- 
invariant  case  so  as  to  simplify  notation.  The  results  of  this  chapter 
can  be  directly  extended  to  the  time-varying  case. 

Consider the discrete-time jump linear system with additive input
noise and scalar x:

$$x_{k+1} = a(r_k)\,x_k + b(r_k)\,u_k + H(r_k)\,v_k \tag{9.1}$$

$$\Pr\{r_{k+1}=j \mid r_k=i,\ x_{k+1}=x\} = p(i,j:x) \tag{9.2}$$

$$x(k_0) = x_0, \qquad r(k_0) = r_0$$

Each transition probability $p(i,j:x)$ of the form process is assumed to
be piecewise-convex or concave in x, having a finite number of pieces,
$\bar{\nu}_{ij}$. That is, the real line is partitioned into $\bar{\nu}_{ij}$
disjoint intervals by the points

$$-\infty = \nu_{ij}(0) < \nu_{ij}(1) < \cdots < \nu_{ij}(\bar{\nu}_{ij}-1) < \nu_{ij}(\bar{\nu}_{ij}) = \infty \tag{9.3}$$

and

$$p(i,j:x) = \lambda_{ij}(x;s) \quad \text{if } \nu_{ij}(s-1) < x < \nu_{ij}(s), \tag{9.4}$$

for $s = 1,\ldots,\bar{\nu}_{ij}$. Each function $\lambda_{ij}(x;s)$ is twice
continuously differentiable in x over the interval
$(\nu_{ij}(s-1), \nu_{ij}(s))$ and either

$$\frac{\partial^2 \lambda_{ij}(x;s)}{\partial x^2} \ge 0 \quad \text{for all } x \in (\nu_{ij}(s-1), \nu_{ij}(s))$$

or

$$\frac{\partial^2 \lambda_{ij}(x;s)}{\partial x^2} \le 0 \quad \text{for all } x \in (\nu_{ij}(s-1), \nu_{ij}(s)).$$


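We require that

$$p(i,j:x) \ge 0 \quad \forall\, i, j \text{ and } x$$

and

$$\sum_{j=1}^{M} p(i,j:x) = 1 \quad \text{for each form } i \in \{1,\ldots,M\}, \text{ at each } x \in \mathbb{R}. \tag{9.5}$$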
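The two requirements in (9.5) are easy to check mechanically for any candidate set of transition probabilities. The following is only an illustrative sketch: it borrows the piecewise-constant probabilities that appear later in Example 9.1 (M = 2 forms, form 2 absorbing), and the function names `p12`, `transition_row` and `is_valid_row` are hypothetical helpers, not notation from the text. The value assigned at exactly |x| = 1 is an assumption, since the example leaves that point unspecified.

```python
from fractions import Fraction

def p12(x):
    """Probability of a 1 -> 2 form transition, piecewise-constant in x.

    Values follow Example 9.1; the choice made at |x| == 1 is an assumption.
    """
    return Fraction(1, 4) if abs(x) < 1 else Fraction(3, 4)

def transition_row(i, x):
    """Row i of the form transition matrix evaluated at state value x (M = 2)."""
    if i == 1:
        return [1 - p12(x), p12(x)]
    return [Fraction(0), Fraction(1)]  # form 2 is absorbing in Example 9.1

def is_valid_row(row):
    """Check the two conditions of (9.5): nonnegative entries and unit row sum."""
    return all(p >= 0 for p in row) and sum(row) == 1

if __name__ == "__main__":
    for x in (-5, -0.3, 0, 0.9, 2):
        for i in (1, 2):
            print(i, x, transition_row(i, x), is_valid_row(transition_row(i, x)))
```

Exact rational arithmetic is used so that the unit-sum test is not affected by rounding.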

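The input noise process $\{v_k\}$ is assumed to be a white noise sequence with
a probability density that is piecewise-convex or concave in $v_k$, having
a finite number of pieces, $\bar{\sigma}$. That is, the real line is partitioned
into $\bar{\sigma}$ disjoint intervals by the points

$$-\infty \triangleq \sigma(0) < \sigma(1) < \cdots < \sigma(\bar{\sigma}-1) < \sigma(\bar{\sigma}) = \infty \tag{9.6}$$

and at each time k

$$p(v_k) = \omega(v_k;s) \quad \text{if } \sigma(s-1) < v_k < \sigma(s), \tag{9.7}$$

for $s = 1,\ldots,\bar{\sigma}$. Each function $\omega(v;s)$ is twice continuously
differentiable in v over the interval $(\sigma(s-1), \sigma(s))$ and either

$$\frac{d^2 \omega(v;s)}{dv^2} \ge 0 \quad \text{for all } v \in (\sigma(s-1), \sigma(s))$$

or

$$\frac{d^2 \omega(v;s)}{dv^2} \le 0 \quad \text{for all } v \in (\sigma(s-1), \sigma(s)).$$

Since $p(v_k)$ is a probability density we require that

$$p(v) \ge 0 \quad \text{(at each } v\text{)}$$

and

$$\int_{-\infty}^{\infty} p(v)\,dv = 1.$$

Consequently

$$\lim_{v \to \pm\infty} p(v) = 0, \qquad \lim_{v \to \pm\infty} \frac{dp(v)}{dv} = 0, \qquad \lim_{v \to \pm\infty} \frac{d^2 p(v)}{dv^2} = 0.$$

By assumption, the input noise is white:

$$\Pr(v_k \mid v_n) = \Pr(v_k) \quad \text{for } k \ne n. \tag{9.8}$$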


We assume (as in Part III and Chapter 8) that the state $(x_k, r_k)$ is
perfectly observed at each k. The problem is to find the optimal
control laws

$$u_k = \phi_k(x_0, \ldots, x_k,\ r_0, \ldots, r_k)$$

that minimize the cost criterion

$$J_{k_0}(x_0, r_0) = E\left\{ \sum_{k=k_0}^{N-1} \left[ u_k^2 R(r_k) + Q(x_{k+1}, r_{k+1}) \right] + Q_T(x_N, r_N) \right\} \tag{9.10}$$

where the expectation is over the form sequence $\{r_k\}$ and the input
noise sequence $\{v_k\}$.

As in the JLQ problems of Part III and JLPQ problems of Chapter 8,
we assume that the penalty on the control signal is quadratic, where

$$R(j) > 0 \quad \text{for each form } j \in \{1,\ldots,M\}. \tag{9.11}$$

The x-operating costs $Q(x,j)$ and terminal costs $Q_T(x,j)$ are assumed to
be piecewise-convex or concave in x, having a finite number of pieces,
$\bar{\mu}^j$. That is, the real line is partitioned into $\bar{\mu}^j$ disjoint
intervals by the points

$$-\infty \triangleq \mu^j(0) < \mu^j(1) < \cdots < \mu^j(\bar{\mu}^j-1) < \mu^j(\bar{\mu}^j) = \infty \tag{9.12}$$

and

$$Q(x,j) = Q^j(x;s) \quad \text{if } \mu^j(s-1) < x < \mu^j(s), \tag{9.13}$$

for $s = 1,\ldots,\bar{\mu}^j$.

The real line is also partitioned into $\bar{\eta}^j$ disjoint intervals by
the points

$$-\infty \triangleq \eta^j(0) < \eta^j(1) < \cdots < \eta^j(\bar{\eta}^j-1) < \eta^j(\bar{\eta}^j) = \infty \tag{9.14}$$

and

$$Q_T(x,j) = Q_T^j(x;s) \quad \text{if } \eta^j(s-1) < x < \eta^j(s), \tag{9.15}$$

for $s = 1,\ldots,\bar{\eta}^j$.

We require that at each x value,

$$Q(x,j) \ge 0, \qquad Q_T(x,j) \ge 0. \tag{9.16}$$

The term $Q_T(x_N, r_N)$ in (9.10) is a terminal cost charged in addition to the
time-invariant x-operating cost $Q(x_N, r_N)$. Since $\{(x_k, r_k) : k = k_0, \ldots, N\}$
is a Markov process we need only consider feedback laws of the type

$$u_k = \phi_k(x_k, r_k).$$

The  JLPC  control  problem  formulation  (9.1)  -  (9.16)  includes  as  special 
cases  the  problems  of  chapters  3,  5-7  and  8.  In  the  next  section  we 
begin  our  analysis  of  this  problem  with  an  examination  of  the  effects 
of  the  additive  noise,  and  how  the  solution  approach  of  chapters  5  and  8 
must  be  modified  to  handle  this  noise. 

9.3 Reformulating JLPC Problems with Additive White Input Noise as
Noiseless Problems

In this section we will describe how the basic solution approach
of chapters 5 and 8 must be modified when there is additive white input
noise in the x dynamics. As we indicated in section 9.1, this
modified solution approach involves the reformulation of the noisy
problem, at each time stage k, as a different (but equivalent) control
problem in an artificial variable $z_k$ (which replaces $x_k$). This
reformulated problem does not have additive noise (it is absorbed in
$z_k$). We will derive and describe this reformulation process via two
example problems.

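Defining the expected cost-to-go $V_k(x_k, r_k)$ as in previous chapters,
and applying dynamic programming from finite terminal time k = N, we
have the relationship:

$$V_k(x_k, r_k) = \min_{u_k} \left\{ u_k^2 R(r_k) + E\left\{ Q(x_{k+1}, r_{k+1}) + V_{k+1}(x_{k+1}, r_{k+1}) \mid x_k, r_k, u_k \right\} \right\} \tag{9.17}$$

for $k = N-1, N-2, \ldots, k_0$, with

$$V_N(x_N, r_N=j) = Q_T(x_N, r_N=j)$$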


Vj)  =•  nD 


(9.18) 


and 


6^(t)  =  nj (t) 

N 


t  =  i, . . . ,nJ  -  i 


From (9.17) - (9.18) we can, in principle, solve for the optimal controls
$u_{N-1}, \ldots, u_{k_0}$.

As in Part III and chapter 8, let us define the conditional expected
cost-to-go by:

$$\hat{V}_{k+1}(x_{k+1} \mid r_k = j) = E\left\{ Q(x_{k+1}, r_{k+1}) + V_{k+1}(x_{k+1}, r_{k+1}) \mid r_k = j,\ x_{k+1} \right\} \tag{9.19}$$

This is a function of $x_{k+1}$. We can rewrite the minimization in (9.17) as

$$V_k(x_k, r_k=j) = \min_{u_k} \left\{ u_k^2 R(r_k) + E\left\{ \hat{V}_{k+1}(x_{k+1} \mid r_k = j) \right\} \right\} \tag{9.20}$$

The expectation in (9.20) is over values of the input noise $v_k$.

The conditional expected cost-to-go in (9.19) will have a piecewise
structure


Vi(xk+ilvj)  *  vk+i(Vi’t> 


fotxk+ls  4*l(t> 


t  =  l,...,^+]_  -  1  (9.21) 


where  the  x,  ,  intervals  are 
k+1 


4*i(t)  ii11'1 


(9.22) 


Here 


"4  4*i 101 4  4*i  (x> 4  - 


•' 4  4*ii4*r11 4  4+i(4*i>  -  “ 


(9.23) 


sure  the  unique  grid  points  obtained  by  superimposing  the  quantities 


(U1 (t)  :  t  «  1, . . . , u1  -  1} 


(x-operating  costs  grid) 


{v  (t)  S  t=l, . . . ,v  -  1} 

J 1  J 1 


(form- transition  probability 
grid) 


«4+i!t)  !  s  =  1 — '”k*i(i)' 


/,vk+i<\*rtk*i"i)'\ 

\  joining  points)  / 


for  each  i  S  C  .  . 

3 


(9.24) 
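The composite partition described here is obtained by superimposing several grids (the x-operating cost grid, the form transition probability grid, and the joining points of the next-stage cost) and keeping the unique points. A minimal sketch of that merging step is given below; the function names `composite_grid` and `intervals`, the tolerance `tol`, and the sample grid values in the usage lines are assumptions made for illustration only.

```python
def composite_grid(*grids, tol=1e-9):
    """Superimpose several lists of interior grid points.

    Returns the sorted list of unique points; points closer than tol are
    treated as the same grid point.
    """
    merged = []
    for point in sorted(p for grid in grids for p in grid):
        if not merged or abs(point - merged[-1]) > tol:
            merged.append(point)
    return merged

def intervals(grid):
    """Open intervals of the real line induced by the interior grid points."""
    edges = [float("-inf")] + list(grid) + [float("inf")]
    return list(zip(edges[:-1], edges[1:]))

if __name__ == "__main__":
    cost_grid = [-0.5, 0.5]          # hypothetical x-operating cost grid
    prob_grid = [-1.0, 1.0]          # hypothetical transition probability grid
    joining_points = [0.5, 2.0]      # hypothetical next-stage joining points
    grid = composite_grid(cost_grid, prob_grid, joining_points)
    print(grid)
    print(intervals(grid))
```

Each interval returned by `intervals` corresponds to one piece over which the conditional expected cost-to-go has a single functional form.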


The solution approach of chapters 5 and 8 might suggest that the way
to solve (9.20) is to convert it into the comparison of a finite set of
constrained-in-$x_{k+1}$ subproblems:


Vk(xk'rk“j)  =  min 


t*l  t  •  •  •  rip. 


j  "'k'VVi1!  Vi'ii11" 

k+1 
where the subproblems are


=  mm 


\ 


s.t. 


{u  R(i)  +  V  (X,  ,  r,  =j)  } 

k  k+1  k+1  k 


Vl  6  4+1(« 


This will not work because of the additive white input noise. We
cannot choose $x_{k+1}$ with certainty. Thus the subproblems above are not
well posed.

Let us now consider an example problem that is identical to example
5.1 (and 6.1, 7.1) except for the presence of additive input noise.
From its solution we will develop a general method for the reformulation
of (9.1) - (9.16) as a comparison of constrained subproblems that are
constrained in a deterministic¹ quantity.


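Example 9.1 (Uniformly Distributed Input Noise)

Consider the following system having uniformly distributed (in
magnitude) bounded white driving noise and M = 2 forms:

$$x_{k+1} = x_k + u_k + v_k \quad \text{if } r_k = 1$$

$$x_{k+1} = 2x_k + u_k + v_k \quad \text{if } r_k = 2$$

$$p(1,2:x) = \begin{cases} 1/4 & \text{if } |x| < 1 \\ 3/4 & \text{if } |x| > 1 \end{cases}$$

$$p(1,1:x) = 1 - p(1,2:x)$$

$$p(2,2) = 1, \qquad p(2,1) = 0$$

¹ deterministic, given $x_k$, $r_k$ and $u_k$.

where the input noise sequence $\{v_k\}$ is a white noise sequence with
uniformly distributed magnitude: density

$$p(v) = \begin{cases} 0 & |v| > 2 \\ 1/4 & |v| < 2 \end{cases}$$

hence

$$E\{v_k\} = 0, \qquad E\{v_k^2\} = \int_{-2}^{2} \frac{v^2}{4}\,dv = 4/3 \quad \text{(all } k\text{)}, \qquad E\{v_k v_s\} = 0 \quad \text{if } k \ne s.$$

We seek to minimize

$$\min_{u_0,\ldots,u_{N-1}} E\left\{ \sum_{k=0}^{N-1} \left[ u_k^2 + x_{k+1}^2 \right] + x_N^2 K_T(r_N) \right\}$$

where $K_T(1) = 0$, $K_T(2) = 3$.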
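The noise moments quoted for this example can be checked directly from the density. The sketch below assumes the density is 1/4 on (-2, 2) and zero elsewhere, and computes the n-th moment exactly from the antiderivative; the function name `uniform_moment` and its `half_width` parameter are hypothetical.

```python
from fractions import Fraction

def uniform_moment(n, half_width=2):
    """n-th moment of a density equal to 1/(2*half_width) on (-half_width, half_width).

    Computed as the integral of v**n / (2a) over (-a, a) using the
    antiderivative v**(n+1) / (n+1).
    """
    a = Fraction(half_width)
    return (a ** (n + 1) - (-a) ** (n + 1)) / ((n + 1) * 2 * a)

if __name__ == "__main__":
    print(uniform_moment(0))  # total probability mass
    print(uniform_moment(1))  # mean
    print(uniform_moment(2))  # second moment
```

With the default half-width of 2 the zeroth, first and second moments come out as 1, 0 and 4/3, matching the normalization, zero mean and second moment stated for the example.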

Once  the  system  attains  form  r  =  2 ,  it  stays  there  and  the 
usual  LQ  solution  applies: 


VW2>  •  \  Vl,: 


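where

$$u_k(x_k, r_k=2) = -L_k(1:2)\,x_k$$

$$K_N(1:2) = K_T(2) = 3$$

$$K_k(1:2) = \frac{a^2(2)\,R(2)\,[K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2)\,[K_{k+1}(1:2) + Q(2)]} = \frac{4\,[K_{k+1}(1:2) + 1]}{2 + K_{k+1}(1:2)}$$

$$L_k(1:2) = \frac{a(2)\,b(2)\,[K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2)\,[K_{k+1}(1:2) + Q(2)]} = \frac{2\,[K_{k+1}(1:2) + 1]}{2 + K_{k+1}(1:2)}$$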
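The recursion for the quadratic cost coefficient and feedback gain in form 2 can be iterated backward from the terminal value. The sketch below assumes a(2) = 2 and b(2) = R(2) = Q(2) = 1, which is what the simplified recursion 4[K+1]/(2+K) implies; it tracks only the quadratic coefficient and the gain, not any additive constant contributed by the noise. The function names `lq_step` and `lq_backward` are hypothetical.

```python
def lq_step(K_next, a=2.0, b=1.0, R=1.0, Q=1.0):
    """One backward step of the scalar LQ recursion in form 2.

    Returns (K_k, L_k) given K_{k+1}. Parameter defaults are the values
    assumed for form 2 of the example.
    """
    P = K_next + Q                  # weight on x_{k+1}^2 seen from stage k
    denom = R + b * b * P
    K = a * a * R * P / denom       # quadratic cost-to-go coefficient
    L = a * b * P / denom           # feedback gain, u_k = -L_k x_k
    return K, L

def lq_backward(K_T=3.0, stages=5):
    """Iterate lq_step from the terminal coefficient K_T for a number of stages."""
    results = []
    K = K_T
    for _ in range(stages):
        K, L = lq_step(K)
        results.append((K, L))
    return results

if __name__ == "__main__":
    for n, (K, L) in enumerate(lq_backward(), start=1):
        print(f"k = N-{n}: K = {K:.6f}, L = {L:.6f}")
```

Starting from K = 3 the first step gives K = 3.2 and L = 1.6, and under these assumed parameters the coefficient settles quickly toward the fixed point of K = 4(K+1)/(K+2), which is 1 + sqrt(5).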


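Now let us examine what happens in form $r_{N-1} = 1$. We are given that
$V_N(x_N, r_N=1) = x_N^2 K_T(1) = 0$. Applying dynamic programming,

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{u_{N-1}} \left\{ u_{N-1}^2 + E\left\{ x_N^2 + V_N(x_N, r_N) \mid x_{N-1}, r_{N-1}=1, u_{N-1} \right\} \right\} \tag{9.25}$$

$$= \min_{u_{N-1}} \left\{ u_{N-1}^2 + E\left\{ x_N^2 + p(1,1:x_N)\,V_N(x_N, r_N=1) + p(1,2:x_N)\,V_N(x_N, r_N=2) \mid x_{N-1}, r_{N-1}=1, u_{N-1} \right\} \right\}$$

$$= \min_{u_{N-1}} \left\{ u_{N-1}^2 + E\left\{ x_N^2 \left[ p(1,1:x_N) + 4p(1,2:x_N) \right] \mid x_{N-1}, r_{N-1}=1, u_{N-1} \right\} \right\}$$

The last equality uses $V_N(x_N, r_N=1) = 0$, $V_N(x_N, r_N=2) = 3x_N^2$ and
$p(1,1:x_N) + p(1,2:x_N) = 1$.

The expectation is over values of the noise $v_{N-1}$. We can influence the
probabilities $p(1,1:x_N)$ and $p(1,2:x_N)$ by our choice of $u_{N-1}$, but we
cannot precisely specify them because the value of $x_N$ depends upon the
noise $v_{N-1}$ as well as $x_{N-1}$ and $u_{N-1}$.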


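Substituting the noise density in (9.25),

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{u_{N-1}} \left\{ u_{N-1}^2 + \int_{-\infty}^{\infty} p(v)\,(x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4p(1,2:x_{N-1}+u_{N-1}+v) \right] dv \right\}$$

$$= \min_{u_{N-1}} \left\{ u_{N-1}^2 + \frac{1}{4} \int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4p(1,2:x_{N-1}+u_{N-1}+v) \right] dv \right\} \tag{9.26}$$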


If $u_{N-1}$ is chosen so that $|x_{N-1} + u_{N-1}| > 3$ then
$|x_N| = |x_{N-1} + u_{N-1} + v_{N-1}| > 1$ for all possible values of $v_{N-1}$, and
therefore $p(1,2:x_N) = 3/4$ and $p(1,1:x_N) = 1/4$. That is, for
$|x_{N-1} + u_{N-1}| > 3$ in this example, the value of the input noise $v_{N-1}$
will not affect the form transition probabilities; we will have

$$\frac{1}{4} \int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4p(1,2:x_{N-1}+u_{N-1}+v) \right] dv = \frac{13}{16} \int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2\,dv$$

$$= \frac{13}{4}\,(x_{N-1} + u_{N-1})^2 + \frac{13}{3}. \tag{9.27}$$
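The closed form in (9.27) can be checked numerically against the integral it comes from. The following is a rough sketch using a midpoint rule; it assumes the Example 9.1 data (noise density 1/4 on (-2, 2), p(1,2:x) equal to 1/4 for |x| < 1 and 3/4 otherwise), and the function names `weight`, `expected_cost` and `closed_form_outer` are hypothetical. The symbol z stands for the deterministic quantity x_{N-1} + u_{N-1}.

```python
def weight(x):
    """p(1,1:x) + 4 p(1,2:x) for the piecewise-constant probabilities of Example 9.1."""
    p12 = 0.25 if abs(x) < 1 else 0.75
    return (1 - p12) + 4 * p12

def expected_cost(z, n=20000):
    """Midpoint-rule value of (1/4) * integral over (-2, 2) of (z+v)^2 * weight(z+v) dv."""
    h = 4.0 / n
    total = 0.0
    for i in range(n):
        v = -2.0 + (i + 0.5) * h
        total += (z + v) ** 2 * weight(z + v)
    return 0.25 * total * h

def closed_form_outer(z):
    """Right-hand side of (9.27); intended only for |z| > 3."""
    return 13.0 / 4.0 * z * z + 13.0 / 3.0

if __name__ == "__main__":
    for z in (-5.0, -3.5, 4.0):
        print(z, expected_cost(z), closed_form_outer(z))
```

For any |z| > 3 the two columns agree to within the quadrature error, because every reachable x_N then lies outside (-1, 1) and the weight is the constant 13/4.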


If we choose $u_{N-1}$ so that $|x_{N-1} + u_{N-1}| < 3$ then the form transition
probabilities will depend upon the value of the input noise. In this
case

$$\frac{1}{4} \int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4p(1,2:x_{N-1}+u_{N-1}+v) \right] dv = \frac{19}{4}\,(x_{N-1} + u_{N-1})^2 + \frac{49}{12}. \tag{9.28}$$


Thus we have different minimization problems, depending upon the value
of $(x_{N-1} + u_{N-1})$.

Surprisingly, in each case the expected cost term

$$E\left\{ x_N^2 + V_N(x_N, r_N) \mid x_{N-1}, r_{N-1}=1, u_{N-1} \right\} \tag{9.29}$$

is only quadratic in $(x_{N-1} + u_{N-1})$. This will not be the case, in
general, for problems of type (9.1) - (9.16) having piecewise-constant
form transition probabilities, piecewise-constant noise densities, and
piecewise-quadratic x costs. As we will see in the next example,
the cost in (9.29) is generally cubic in $(x_k + u_k)$ for such problems.
The expression in (9.28) does not contain cubic terms because of the
symmetric nature of the limits of integration and the fact that
p(1,2:x), p(1,1:x) and p(v) all have only three constant pieces. We
have chosen to study this somewhat unrepresentative example problem
here because it highlights the solution approach for handling JLPC
problems possessing additive input noise, without introducing extraneous
complications.

The following strategy for computing $V_{N-1}(x_{N-1}, r_{N-1}=1)$ and the
associated optimal control law is suggested by (9.26) - (9.28):

For each of the regions of $(x_{N-1} + u_{N-1})$ values, solve the constrained
optimization problem that assumes $(x_{N-1} + u_{N-1})$ is in the specified
region. Once we have the solutions to these problems, we compare
them and obtain the optimal solution by choosing the smallest of
these for each value of $x_{N-1}$.

As we have indicated, in this example there are three $(x_{N-1} + u_{N-1})$ regions:

(1) $x_{N-1} + u_{N-1} \le -3$, where $p(1,2:x_N) = 3/4$;

(2) $-3 \le x_{N-1} + u_{N-1} \le 3$, where $p(1,2:x_N) = 1/4$ if
$|x_{N-1} + u_{N-1} + v_{N-1}| < 1$ and $p(1,2:x_N) = 3/4$ otherwise;

(3) $x_{N-1} + u_{N-1} \ge 3$, where $p(1,2:x_N) = 3/4$.


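The three corresponding constrained control problems are:

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 1) = \min_{\substack{u_{N-1} \ \text{s.t.} \\ x_{N-1}+u_{N-1} \le -3}} \left\{ u_{N-1}^2 + \frac{13}{4}(x_{N-1} + u_{N-1})^2 + \frac{13}{3} \right\} \tag{9.30}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \min_{\substack{u_{N-1} \ \text{s.t.} \\ -3 \le x_{N-1}+u_{N-1} \le 3}} \left\{ u_{N-1}^2 + \frac{19}{4}(x_{N-1} + u_{N-1})^2 + \frac{49}{12} \right\} \tag{9.31}$$

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 3) = \min_{\substack{u_{N-1} \ \text{s.t.} \\ x_{N-1}+u_{N-1} \ge 3}} \left\{ u_{N-1}^2 + \frac{13}{4}(x_{N-1} + u_{N-1})^2 + \frac{13}{3} \right\} \tag{9.32}$$

The costs in the first and third problems are the same, because of
the symmetry of $p(1,2:x_N)$ and $p(v)$ about zero.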
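Each constrained problem of this kind is a scalar convex quadratic minimized over an interval of z = x_{N-1} + u_{N-1}, so the solution is the unconstrained minimizer clipped to the interval. The sketch below shows that step for a generic cost u^2 + c z^2 + d; the function name `constrained_min` and its arguments are hypothetical, and the open boundaries used in the text for hedging are treated here as closed, which is an approximation.

```python
def constrained_min(x, c, d, lo, hi):
    """Minimize u**2 + c*(x+u)**2 + d over u subject to lo <= x + u <= hi.

    Returns (u, cost). Works in terms of z = x + u: the unconstrained
    optimum is z = x / (1 + c), and because the cost is convex in z the
    constrained optimum is that value clipped to [lo, hi].
    """
    z_free = x / (1.0 + c)
    z = min(max(z_free, lo), hi)
    u = z - x
    return u, u * u + c * z * z + d

if __name__ == "__main__":
    inf = float("inf")
    # Region (1) of the example: c = 13/4, d = 13/3, x + u <= -3
    for x in (-20.0, -12.75, -5.0, 0.0):
        u, cost = constrained_min(x, 13 / 4, 13 / 3, -inf, -3.0)
        print(f"x = {x:7.2f}  u = {u:8.4f}  cost = {cost:9.4f}")
```

With the region (1) data the clipping becomes active for x above -12.75: below that value the control is -(13/17) x, and above it the control places x + u on the boundary at -3.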

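Consider the second $(x_{N-1} + u_{N-1})$ region:

$$-3 \le x_{N-1} + u_{N-1} \le 3.$$

Differentiating $V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2)$ in (9.31) with respect to $u_{N-1}$:

$$\frac{\partial V_{N-1}(x_{N-1}, 1 \mid 2)}{\partial u_{N-1}} = \frac{19}{2}\,x_{N-1} + \frac{23}{2}\,u_{N-1} \tag{9.33}$$

$$\frac{\partial^2 V_{N-1}(x_{N-1}, 1 \mid 2)}{\partial u_{N-1}^2} = \frac{23}{2} > 0 \tag{9.34}$$

hence setting (9.33) to zero yields the minimizing

$$u_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \left( -\frac{19}{23} \right) x_{N-1} = (-.82609)\,x_{N-1} \tag{9.35}$$

with the resulting cost

$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid 2) = \frac{19}{23}\,x_{N-1}^2 + \frac{49}{12} = (.82609)\,x_{N-1}^2 + 4.0833 \tag{9.36}$$

But $u_{N-1}$ in (9.35) solves (9.31) only if the resulting $(x_{N-1} + u_{N-1})$
value satisfies the constraint

$$-3 \le x_{N-1} + u_{N-1} \le 3. \tag{9.37}$$

This holds if and only if

$$-17.25 \le x_{N-1} \le 17.25.$$

From (9.34) we see that for $x_{N-1} > 17.25$, the best choice of $u_{N-1}$
that satisfies (9.37) is

$$u_{N-1} = 3 - x_{N-1}.$$

The resulting cost is

$$V_{N-1} = x_{N-1}^2 - 6x_{N-1} + 335/6.$$

Similarly for $x_{N-1} < -17.25$, the best choice of $u_{N-1}$ that satisfies (9.37)
is

$$u_{N-1} = -3 - x_{N-1}$$

with resulting cost

$$V_{N-1} = x_{N-1}^2 + 6x_{N-1} + 335/6.$$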

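Thus the optimal cost-to-go in (9.31) has a three-piece quadratic
structure in $x_{N-1}$.

The other two constrained control problems (9.30), (9.32) have two-piece
quadratic structures. The optimal expected costs-to-go for all
three constrained subproblems are:

$$V_{N-1}(x_{N-1}, 1 \mid 1) = \begin{cases} (.7647)\,x_{N-1}^2 + 4.333 & \text{if } x_{N-1} \le -12.75 \\ x_{N-1}^2 + 6x_{N-1} + 42.58 & \text{if } x_{N-1} \ge -12.75 \end{cases} \tag{9.38}$$

$$V_{N-1}(x_{N-1}, 1 \mid 2) = \begin{cases} x_{N-1}^2 + 6x_{N-1} + 55.83 & \text{if } x_{N-1} \le -17.25 \\ (.82069)\,x_{N-1}^2 + 4.083 & \text{if } -17.25 \le x_{N-1} \le 17.25 \\ x_{N-1}^2 - 6x_{N-1} + 55.83 & \text{if } 17.25 \le x_{N-1} \end{cases} \tag{9.39}$$

$$V_{N-1}(x_{N-1}, 1 \mid 3) = \begin{cases} x_{N-1}^2 - 6x_{N-1} + 42.58 & \text{if } x_{N-1} \le 12.75 \\ (.7647)\,x_{N-1}^2 + 4.333 & \text{if } x_{N-1} \ge 12.75 \end{cases} \tag{9.40}$$

and the corresponding control laws are

$$u_{N-1}(x_{N-1}, 1 \mid 1) = \begin{cases} -.7647\,x_{N-1} & \text{if } x_{N-1} \le -12.75 \\ -x_{N-1} - 3^- & \text{if } x_{N-1} \ge -12.75 \end{cases} \tag{9.41}$$

$$u_{N-1}(x_{N-1}, 1 \mid 2) = \begin{cases} -x_{N-1} - 3 & \text{if } x_{N-1} \le -17.25 \\ -.82069\,x_{N-1} & \text{if } -17.25 \le x_{N-1} \le 17.25 \\ -x_{N-1} + 3^- & \text{if } 17.25 \le x_{N-1} \end{cases} \tag{9.42}$$

$$u_{N-1}(x_{N-1}, 1 \mid 3) = \begin{cases} -x_{N-1} + 3 & \text{if } x_{N-1} \le 12.75 \\ -.7647\,x_{N-1} & \text{if } x_{N-1} \ge 12.75 \end{cases} \tag{9.43}$$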


Having solved the constrained problems (9.30) - (9.32) we are now ready
to compare them:

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \min_{t=1,2,3} V_{N-1}(x_{N-1}, r_{N-1}=1 \mid t) \tag{9.44}$$

This is done graphically in figure 9.1. Choosing the lowest of the
three constrained costs at each $x_{N-1}$ value, we see that the optimal
expected cost-to-go, control law, and optimal $(u_{N-1} + x_{N-1})$ values
are as listed in table 9.1.

Figure 9.1 $V_{N-1}(x_{N-1}, r_{N-1}=1)$ in example 9.1 and subproblem costs;
$V_{N-1}(x_{N-1}, 1 \mid 1)$ is indicated by the dashed line,
$V_{N-1}(x_{N-1}, 1 \mid 2)$ by the dotted line and $V_{N-1}(x_{N-1}, 1 \mid 3)$ by the
dot-dash line. The optimal cost $V_{N-1}(x_{N-1}, r_{N-1}=1)$ is
indicated by the solid overline.


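Range of $x_{N-1}$ | $V_{N-1}(x_{N-1},r_{N-1}=1)$ | $u_{N-1}$ | $x_{N-1}+u_{N-1}$
$x_{N-1} \le -12.75$ | $(.7647)x_{N-1}^2 + 4.333$ | $-.7647\,x_{N-1}$ | $.2353\,x_{N-1}$
$-12.75 \le x_{N-1} \le -8.655$ | $x_{N-1}^2 + 6x_{N-1} + 42.58$ | $-x_{N-1} - 3^{-}$ | $-3^{-}$
$-8.655 \le x_{N-1} \le 8.655$ | $(.8207)x_{N-1}^2 + 4.083$ | $-.8207\,x_{N-1}$ | $.1793\,x_{N-1}$
$8.655 \le x_{N-1} \le 12.75$ | $x_{N-1}^2 - 6x_{N-1} + 42.58$ | $-x_{N-1} + 3^{+}$ | $3^{+}$
$12.75 \le x_{N-1}$ | $(.7647)x_{N-1}^2 + 4.333$ | $-.7647\,x_{N-1}$ | $.2353\,x_{N-1}$

Table 9.1  Optimal Expected Cost-to-go, Control Law, and $x_{N-1} + u_{N-1}$ from $(x_{N-1}, r_{N-1}=1)$ in Example 9.1.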
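The comparison behind table 9.1 can be sketched numerically. The snippet below is only an illustration: the function names are hypothetical, and the coefficients are the rounded values quoted in (9.38) - (9.40). It evaluates the three constrained costs, picks the cheapest at a given $x_{N-1}$, and recovers the crossover near $-8.655$ with the quadratic formula.

```python
import math

def v1(x):
    # Subproblem 1 cost, rounded coefficients from (9.38)
    return 0.7647 * x**2 + 4.333 if x <= -12.75 else x**2 + 6 * x + 42.58

def v2(x):
    # Subproblem 2 cost, rounded coefficients from (9.39)
    if x <= -17.25:
        return x**2 + 6 * x + 55.83
    if x >= 17.25:
        return x**2 - 6 * x + 55.83
    return 0.82069 * x**2 + 4.083

def v3(x):
    # Subproblem 3 cost, rounded coefficients from (9.40)
    return x**2 - 6 * x + 42.58 if x <= 12.75 else 0.7647 * x**2 + 4.333

def best_subproblem(x):
    """Return (t, cost) for the cheapest constrained subproblem at x."""
    costs = [v1(x), v2(x), v3(x)]
    t = min(range(3), key=lambda i: costs[i])
    return t + 1, costs[t]

def crossover():
    """Negative-side point where x^2 + 6x + 42.58 meets 0.82069 x^2 + 4.083."""
    a, b, c = 1.0 - 0.82069, 6.0, 42.58 - 4.083
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
```

Evaluating `best_subproblem` on a grid of $x_{N-1}$ values reproduces the five ranges of the table, under the rounding of the quoted coefficients.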


In figure 9.2 we can compare the last-stage solution of this problem and the noiseless version of this example (example 5.1 in Section 5.3).

We make the following observations:

1.  In both example 9.1 and example 5.1 the optimal expected cost $V_{N-1}(x_{N-1},r_{N-1}=1)$ is piecewise-quadratic in $x_{N-1}$ and the optimal control law is piecewise-linear. When we go back another stage in time, the optimal cost $V_{N-2}(x_{N-2},r_{N-2}=1)$ can be obtained using a similar approach. The piecewise-quadratic structure will be lost, however, in example 9.1.


2.  The endpiece control laws in both examples are the same:

$$u_{N-1}(x_{N-1},r_{N-1}=1) = (-.7647)\,x_{N-1} .$$

In both examples this control law corresponds to making $|x_N| > 1$ with certainty. In example 5.1, where there is no input noise, this is done by making

$$|x_{N-1} + u_{N-1}| > 1 .$$

In example 9.1 we must make

$$|x_{N-1} + u_{N-1}| \ge 3$$

to guarantee that $|x_N| > 1$ (since no matter what value in $(-2,2)$ the noise $v_{N-1}$ takes, we will have $|x_N| > 1$).


3.  In example 5.1 we use control

$$u_{N-1} = -x_{N-1} - 1$$

to hedge to point $x_N = -1$ when $-12.75 \le x_{N-1} \le -8.65$. In example 9.1 we cannot place $x_N$ with certainty. However, we can place $(x_{N-1} + u_{N-1})$. In example 9.1 when $-12.75 \le x_{N-1} \le -8.65$ we use the control

$$u_{N-1} = -x_{N-1} - 3^{-}$$

to obtain

$$(x_{N-1} + u_{N-1}) = -3^{-} .$$

This corresponds to using control $u_{N-1}$ to guarantee $x_N \le -1$ with certainty.

Note that in example 5.1 we hedge-to-a-point to get $x_N$ inside $(-1,1)$. In example 9.1, however, we hedge to keep $x_N$ outside $(-1,1)$ even though the probability of failure is larger there. The reason is that for

$$8.65 \le |x_{N-1}| \le 12.75$$

it is better to keep $x_N$ in the disadvantageous $p(1,2:x)$ pieces $|x_N| > 1$ with certainty (by making $|x_{N-1} + u_{N-1}| \ge 3$) than it is to risk having either $|x_N| > 1$ or $|x_N| < 1$ (by making $|x_{N-1} + u_{N-1}| < 3$). That is, it is not worth spending extra control energy to try to get $x_N$ inside the advantageous $p(1,2:x_N)$ piece because we cannot do it with certainty.

As in the JLQ problems of chapter 5, the optimal controller has regions of avoidance. However, for example 9.1 they are regions of $(x_{N-1} + u_{N-1})$ avoidance (instead of $x_N$ avoidance). From Table 9.1 we see that the optimal controller chooses $u_{N-1}$ so that $(x_{N-1} + u_{N-1})$ does not take values in the intervals

$$(-3^{+},\ -1.55) \quad \text{and} \quad (1.55,\ 3^{-}) .$$


Example  9.2  Let  us  now  modify  example  9.1  so  that  the  expected  cost 
term 


Et4  +  WV'Vi'Vi*1, 


(9.45) 


is piecewise-cubic in $(x_{N-1} + u_{N-1})$. Consider the control problem of example 9.1, but with

$$p(1,2:x) = \begin{cases} 1/4 & \text{if } x < 1 \\ 3/4 & \text{if } x > 1 \end{cases} \qquad (9.46)$$

In form $r = 2$ the solution is the same as in example 9.1, and (9.25) - (9.26) apply in form $r = 1$.

If $u_{N-1}$ is chosen so that $(x_{N-1} + u_{N-1}) \ge 3$ then $x_N > 1$, and therefore $p(1,2:x_N) = 3/4$. The expected cost (9.45) in this case is given by (9.27).

If $u_{N-1}$ is chosen so that $(x_{N-1} + u_{N-1}) \le -1$ then $x_N < 1$ with certainty, and therefore $p(1,2:x_N) = 1/4$. When this is the case,

$$\frac{1}{4}\int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4\,p(1,2:x_{N-1}+u_{N-1}+v) \right] dv = \frac{7}{4}(x_{N-1}+u_{N-1})^2 + \frac{7}{3} . \qquad (9.47)$$

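If we choose $u_{N-1}$ so that $-1 < (x_{N-1} + u_{N-1}) < 3$ then the form transition probabilities will depend upon the value of the input noise. In this case,

$$\frac{1}{4}\int_{-2}^{2} (x_{N-1}+u_{N-1}+v)^2 \left[ p(1,1:x_{N-1}+u_{N-1}+v) + 4\,p(1,2:x_{N-1}+u_{N-1}+v) \right] dv$$

$$= \frac{7}{16}\int_{-2}^{1-(x_{N-1}+u_{N-1})} (x_{N-1}+u_{N-1}+v)^2\,dv + \frac{13}{16}\int_{1-(x_{N-1}+u_{N-1})}^{2} (x_{N-1}+u_{N-1}+v)^2\,dv$$

$$= \frac{1}{8}(x_{N-1}+u_{N-1})^3 + \frac{5}{2}(x_{N-1}+u_{N-1})^2 + \frac{3}{2}(x_{N-1}+u_{N-1}) + \frac{77}{24} . \qquad (9.48)$$

Thus the expected cost in (9.45) is piecewise-cubic in $(x_{N-1} + u_{N-1})$.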
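The piecewise-cubic form can be checked numerically. The sketch below assumes, as the constants $7/4$ and $7/3$ in this example suggest, that the noise is uniform on $(-2,2)$ and that the integrand is $(z+v)^2\,[p(1,1:z+v) + 4p(1,2:z+v)]$ with $z = x_{N-1}+u_{N-1}$; the function names are illustrative only.

```python
def weight(x):
    # p(1,1:x) + 4 p(1,2:x), with p(1,2:x) taken from (9.46); the weights
    # 1 and 4 are an assumption inferred from the constants in the text.
    p12 = 0.25 if x < 1 else 0.75
    return (1 - p12) + 4 * p12

def expected_cost(z, n=20000):
    # Midpoint rule for (1/4) * integral over (-2, 2) of (z+v)^2 * weight(z+v) dv
    h = 4.0 / n
    total = 0.0
    for i in range(n):
        v = -2.0 + (i + 0.5) * h
        total += (z + v) ** 2 * weight(z + v)
    return 0.25 * total * h

def cubic(z):
    # Closed form obtained by splitting the integral at v = 1 - z
    return z**3 / 8 + 2.5 * z**2 + 1.5 * z + 77 / 24
```

At the ends of the middle interval the cubic meets the two quadratic pieces: `cubic(-1)` equals $7/4 + 7/3$ and `cubic(3)` equals $(13/4)\cdot 9 + 13/3$, so under these assumptions the expected cost is continuous in $(x_{N-1}+u_{N-1})$.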

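$V_{N-1}(x_{N-1},r_{N-1}=1)$ is obtainable by comparing the solutions of the constrained-in-$(x_{N-1} + u_{N-1})$ subproblems:

$$V_{N-1}(x_{N-1},r_{N-1}=1|1) = \min_{u_{N-1}\ \text{s.t.}\ (x_{N-1}+u_{N-1}) \le -1} \left\{ u_{N-1}^2 + \frac{7}{4}(x_{N-1}+u_{N-1})^2 + \frac{7}{3} \right\} \qquad (9.49)$$

$$V_{N-1}(x_{N-1},r_{N-1}=1|2) = \min_{u_{N-1}\ \text{s.t.}\ -1 < (x_{N-1}+u_{N-1}) < 3} \left\{ u_{N-1}^2 + \frac{1}{8}(x_{N-1}+u_{N-1})^3 + \frac{5}{2}(x_{N-1}+u_{N-1})^2 + \frac{3}{2}(x_{N-1}+u_{N-1}) + \frac{77}{24} \right\} \qquad (9.50)$$

$$V_{N-1}(x_{N-1},r_{N-1}=1|3) = \min_{u_{N-1}\ \text{s.t.}\ (x_{N-1}+u_{N-1}) \ge 3} \left\{ u_{N-1}^2 + \frac{13}{4}(x_{N-1}+u_{N-1})^2 + \frac{13}{3} \right\} \qquad (9.51)$$

We will defer solution of these subproblems and the determination of $V_{N-1}(x_{N-1},r_{N-1}=1)$ until section 9.8.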


Let us return to the general control problem of this chapter. We can follow the idea used in the above examples and reformulate (9.20) as a comparison of a set of constrained subproblem solutions. Let $z_{k+1}$ be the value that $x_{k+1}$ would have, given $r_k$ and assuming that the noise is zero:

$$z_{k+1} = a(r_k)x_k + b(r_k)u_k . \qquad (9.52)$$

That is,

$$z_{k+1} = x_{k+1} - \Xi(r_k)v_k . \qquad (9.53)$$

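We define the z-conditional expected cost-to-go as follows:

$$\hat{\hat V}_{k+1}(z_{k+1}|r_k=j) = E\left\{ V_{k+1}(x_{k+1},r_{k+1}) + Q(x_{k+1},r_{k+1}) \,\middle|\, r_k = j,\ z_{k+1} = a(j)x_k + b(j)u_k \right\}$$

$$= E\left\{ V_{k+1}(a(j)x_k + b(j)u_k + \Xi(j)v_k,\ r_{k+1}) + Q(a(j)x_k + b(j)u_k + \Xi(j)v_k,\ r_{k+1}) \,\middle|\, r_k = j,\ x_k,\ u_k \right\}$$

$$= E\left\{ \hat V_{k+1}(x_{k+1}|r_k=j) \,\middle|\, r_k = j,\ x_k,\ u_k \right\} \qquad (9.54)$$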

The minimization in (9.17), (9.20) then becomes

$$V_k(x_k,r_k=j) = \min_{u_k} \left\{ u_k^2 R(r_k) + \hat{\hat V}_{k+1}(z_{k+1}|r_k=j) \right\} . \qquad (9.55)$$

As we shall see in later sections of this chapter, the behavior of this z-conditional expected cost function is intimately related to qualitative properties of the optimal controller and combinatoric properties of the solution.

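We can solve for $V_k(x_k,r_k=j)$ in (9.55) by comparing (at each $x_k$ value) a finite set of constrained-in-$z_{k+1}$ subproblems:

$$V_k(x_k,r_k=j) = \min_{t=1,\ldots,\hat\psi_{k+1}} \left\{ V_k(x_k,r_k=j \,|\, z_{k+1} \in \hat\Delta_{k+1}(t)) \right\} \qquad (9.56)$$

where the $\hat\psi_{k+1}$ subproblems are

$$V_k(x_k,r_k=j \,|\, z_{k+1} \in \hat\Delta_{k+1}(t)) = \min_{u_k\ \text{s.t.}\ z_{k+1} \in \hat\Delta_{k+1}(t)} \left\{ u_k^2 R(j) + \hat{\hat V}_{k+1}(z_{k+1}|r_k=j) \right\} \triangleq V_k(x_k,r_k=j|t) \qquad (9.57)$$

and the $\hat\Delta_{k+1}(t)$ are intervals of $z_{k+1}$ values.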


In principle, we can solve for the optimal controllers for (9.1) - (9.16) at each time stage $k$ and in each form $j \in M$ if we can solve (9.55) - (9.57) subject to (9.52) - (9.54).

We have now incorporated additive input noise in the problem solution approach of chapters 5 and 8, by reformulating the noisy JLPC problem (9.1) - (9.16) as a comparison of subproblems constrained in the deterministic quantity $z_{k+1}$ (given $u_k, x_k, r_k$). Note that if there is no input noise then $z_{k+1} = x_{k+1}$ and $\hat{\hat V}_{k+1}(z_{k+1}|r_k=j) = \hat V_{k+1}(x_{k+1}|r_k=j)$.
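The comparison of constrained-in-$z_{k+1}$ subproblems can be sketched with a brute-force grid search. This is a minimal illustration, not the analytical procedure developed in this chapter: the function name `one_stage`, the search half-width `span`, and the grid size `n` are all assumptions, and the z-conditional cost is passed in as an ordinary function.

```python
import numpy as np

def one_stage(x, zcost, edges, R=1.0, a=1.0, b=1.0, span=50.0, n=20001):
    """Compare constrained-in-z subproblems at state x.

    zcost : vectorized z-conditional cost (assumed given)
    edges : interior boundaries of the z partition, in increasing order
    Returns (cost, u, t) for the best subproblem t (1-based).
    """
    bounds = [-np.inf] + list(edges) + [np.inf]
    best = None
    for t in range(1, len(bounds)):
        # Restrict each (possibly unbounded) interval to a finite search window
        lo = max(bounds[t - 1], a * x - span)
        hi = min(bounds[t], a * x + span)
        if lo > hi:
            continue
        z = np.linspace(lo, hi, n)
        u = (z - a * x) / b              # control that places z, from z = a x + b u
        cost = R * u**2 + zcost(z)       # objective of the t-th subproblem
        i = int(np.argmin(cost))
        if best is None or cost[i] < best[0]:
            best = (float(cost[i]), float(u[i]), t)
    return best
```

For a purely quadratic z-conditional cost such as `lambda z: 0.25 * z**2 + 750.208` the grid search recovers the unconstrained minimizer $z = 0.8x$ whenever it lies inside one of the intervals.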


Two issues must be resolved before we can use this reformulation to solve (9.1) - (9.16):

(1)  How do we obtain the partition of $z_{k+1}$ values (that is, the intervals $\hat\Delta_{k+1}(t)$; $t = 1,\ldots,\hat\psi_{k+1}$) in (9.56) - (9.57)?

(2)  How do we solve the subproblems in (9.57) when $\hat{\hat V}_{k+1}(z_{k+1}|r_k=j)$ is not piecewise-quadratic in $z_{k+1}$?

We address these questions in the remainder of this chapter.


9.4  Solution  of  a  One-Stage  Example  Problem: 

In  the  previous  section  we  described  how  a  JLPC  problem  specified 
by  (9.1)  -  (9.16)  can  be  transformed  into  the  comparison  of  a  finite 
number of constrained-in-$z_{k+1}$ subproblems.  In  this  section  we  will  carry
out  this  reformulation  for  one  stage  of  an  example  problem,  and  we 
will  solve  for  the  optimal  controller. 

As  we  obtain  the  solution  of  this  example  problem  we  will 
make  a  number  of  observations  regarding  JLPC  problems  possessing  additive 
noise.  Using  insight  gained  from  the  solution  of  this  example  problem, 
we  will  develop  a  general  one-step  solution  procedure  in  section  9.5, 
and  two  approximate  (suboptimal)  controllers  in  section  9.9  . 

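Example 9.3: This example is the same as example 8.4, except for the inclusion of additive input noise. It involves x-costs that are piecewise-quadratic in x, form transition probabilities that are piecewise-constant in x, and a piecewise-constant input noise density.

There are 2 forms, with

$$x_{k+1} = \begin{cases} x_k + u_k + v_k & \text{if } r_k = 1 \quad \text{(normal operation)} \\ x_k & \text{if } r_k = 2 \quad \text{(failure)} \end{cases}$$

where

$$p(1,2:x) = \begin{cases} 1/4 & \text{if } |x| < 1 \\ 3/4 & \text{if } |x| > 1 \end{cases} \qquad p(1,1:x) = 1 - p(1,2:x), \qquad p(2,2) = 1, \quad p(2,1) = 0$$

and the noise density is

$$p(v) = \begin{cases} 3/8 & \text{if } |v| < 1 \\ 1/8 & \text{if } 1 < |v| < 2 \\ 0 & \text{if } |v| > 2 \end{cases}$$

hence

$$E\{v_k\} = 0, \qquad E\{v_k^2\} = 5/6, \qquad E\{v_k v_\ell\} = 0 \ \text{ for } k \ne \ell .$$

In the notation of section 9.2 we have noise-density joining points

$$\sigma(1) = -2, \quad \sigma(2) = -1, \quad \sigma(3) = 1, \quad \sigma(4) = 2 \qquad (\bar\sigma = 5 \text{ pieces}).$$

We seek to minimize

$$\min_{u_0,\ldots,u_{N-1}} E\left\{ \sum_{k=0}^{N-1} \left[ u_k^2 + Q(x_{k+1},r_{k+1}) \right] + Q_T(x_N,r_N) \right\}$$

where

$$Q(x_{k+1},r_{k+1}=2) = 0, \qquad Q_T(x_N,r_N=2) = 1000, \qquad Q_T(x_N,r_N=1) = 0$$

and

$$Q(x_{k+1},r_{k+1}=1) = \begin{cases} x_{k+1}^2 & \text{if } x_{k+1} \le -.5 \\ -2x_{k+1}^2 - 1.5x_{k+1} & \text{if } -.5 < x_{k+1} < 0 \\ -2x_{k+1}^2 + 1.5x_{k+1} & \text{if } 0 \le x_{k+1} < .5 \\ x_{k+1}^2 & \text{if } .5 \le x_{k+1} \end{cases}$$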
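The stated moments of the piecewise-constant noise density of example 9.3 can be verified with exact rational arithmetic. The piece list below simply encodes the density given in the example; the helper name `moment` is illustrative.

```python
from fractions import Fraction as F

# (lower limit, upper limit, density value) for each nonzero piece of p(v)
PIECES = [
    (F(-2), F(-1), F(1, 8)),
    (F(-1), F(1), F(3, 8)),
    (F(1), F(2), F(1, 8)),
]

def moment(k):
    """Exact k-th moment: integral of v^k p(v) dv over the support."""
    return sum(d * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1) for lo, hi, d in PIECES)
```

`moment(0)` confirms that the density integrates to one, while `moment(1)` and `moment(2)` give the mean and the second moment quoted in the example.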


As in example 8.4, in form $r = 2$ (i.e., in the failure mode) the optimal expected cost-to-go and control law are

$$V_k(x_k,r_k=2) = 1000$$

$$u_k(x_k,r_k=2) = 0$$

at all times $k$. In form $r = 1$ we have

$$V_N(x_N,r_N=1) = 0 .$$


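As in example 8.4, the conditional expected cost $\hat V_N(x_N|r_{N-1}=1)$ for this problem is

$$\hat V_N(x_N|r_{N-1}=1) = \begin{cases} \hat V_N(x_N;1) = (.25)x_N^2 + 750 & \text{if } x_N < -1 \\ \hat V_N(x_N;2) = (.75)x_N^2 + 250 & \text{if } -1 < x_N < -.5 \\ \hat V_N(x_N;3) = (-1.5)x_N^2 - (.375)x_N + 250 & \text{if } -.5 < x_N < 0 \\ \hat V_N(x_N;4) = (-1.5)x_N^2 + (.375)x_N + 250 & \text{if } 0 < x_N < .5 \\ \hat V_N(x_N;5) = (.75)x_N^2 + 250 & \text{if } .5 < x_N < 1 \\ \hat V_N(x_N;6) = (.25)x_N^2 + 750 & \text{if } 1 < x_N \end{cases} \qquad (9.58)$$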


where we have a partition of $x_N$ values with $\psi_N = 6$ pieces, specified by

$$\gamma_N(1) = -1 = -\gamma_N(5), \qquad \gamma_N(2) = -.5 = -\gamma_N(4), \qquad \gamma_N(3) = 0$$

$$\Delta_N(1) = (-\infty,-1), \quad \Delta_N(2) = (-1,-.5), \quad \Delta_N(3) = (-.5,0), \quad \Delta_N(4) = (0,.5), \quad \Delta_N(5) = (.5,1), \quad \Delta_N(6) = (1,\infty) . \qquad (9.59)$$

$\hat V_N(x_N|r_{N-1}=1)$ is discontinuous at $x_N = \pm 1$ (the form transition probability discontinuities).

The grid points in (9.59) are joining points of the x-operating cost $Q(x,r=1)$ and discontinuities of the form transition probability $p(1,2:x)$.


Now let us compute the z-conditional expected cost-to-go $\hat{\hat V}_N(z_N|r_{N-1}=1)$. From (9.54) we have

$$\hat{\hat V}_N(z_N|r_{N-1}=1) = \int_{-\infty}^{\infty} p(v)\, \hat V_N(z_N + v\,|\,r_{N-1}=1)\, dv . \qquad (9.60)$$


In this example the input noise density $p(v)$ is piecewise-constant with $\bar\sigma = 5$ pieces and $p(v) = 0$ over the leftmost and rightmost pieces. Here $\hat V_N(x_N|r_{N-1}=1)$ has $\psi_N = 6$ pieces. Thus we can rewrite (9.60) as a sum of $(\bar\sigma - 2)\psi_N = 18$ integrals:

$$\hat{\hat V}_N(z_N|r_{N-1}=1) = \sum_{s=2}^{\bar\sigma-1} \Bigg\{ \int_{\sigma(s-1)}^{\max[\sigma(s-1),\ \min[\gamma_N(1)-z_N,\ \sigma(s)]]} p(v;s)\, \hat V_N(z_N+v;1)\, dv$$

$$+ \sum_{t=2}^{\psi_N-1} \int_{\min[\sigma(s),\ \max[\gamma_N(t-1)-z_N,\ \sigma(s-1)]]}^{\max[\sigma(s-1),\ \min[\gamma_N(t)-z_N,\ \sigma(s)]]} p(v;s)\, \hat V_N(z_N+v;t)\, dv$$

$$+ \int_{\min[\sigma(s),\ \max[\gamma_N(\psi_N-1)-z_N,\ \sigma(s-1)]]}^{\sigma(s)} p(v;s)\, \hat V_N(z_N+v;\psi_N)\, dv \Bigg\} \qquad (9.61)$$

(where $\hat V_N(z_N+v;t)$ denotes the $t$th piece of $\hat V_N(z_N+v|r_{N-1}=1)$, as in (9.21), $p(v;s)$ denotes the $s$th piece of the noise density $p(v)$ as in (9.6), and the $\gamma_N(t)$ are as in (9.59)).


The numerical values of the limits of integration in (9.61) depend upon the value of $z_N$. This gives $\hat{\hat V}_N(z_N|r_{N-1}=1)$ a piecewise structure in $z_N$.

We have a boundary between pieces of $\hat{\hat V}_N(z_N|r_{N-1}=1)$ at each $z_N$ value where one (or more) of the limits of integration in (9.61) changes. It is straightforward to verify that for each $s = 2,3,\ldots,\bar\sigma-1 = 4$ and $t = 1,2,\ldots,\psi_N-1 = 5$ we have

$$\max\{\sigma(s-1),\ \min(\gamma_N(t)-z_N,\ \sigma(s))\} = \min\{\sigma(s),\ \max(\gamma_N(t)-z_N,\ \sigma(s-1))\} = \begin{cases} \sigma(s) & \text{if } z_N \le \gamma_N(t) - \sigma(s) \\ \gamma_N(t) - z_N & \text{if } \gamma_N(t) - \sigma(s) < z_N < \gamma_N(t) - \sigma(s-1) \\ \sigma(s-1) & \text{if } \gamma_N(t) - \sigma(s-1) \le z_N \end{cases} \qquad (9.62)$$


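The boundaries of the $\hat{\hat V}_N(z_N|r_{N-1}=1)$ pieces' domains are the set of values

$$\{\gamma_N(t) - \sigma(s) :\ s = 1,\ldots,\bar\sigma-1;\ t = 1,\ldots,\psi_N-1\}$$

which, for this example, are

$$\{-3,\ -2.5,\ -2,\ -1.5,\ -1,\ -.5,\ 0,\ .5,\ 1,\ 1.5,\ 2,\ 2.5,\ 3\} .$$

Ordering these 13 quantities from smallest to largest, and denoting them by $\hat\gamma_N(t)$ $(t = 1,\ldots,13)$, we obtain a partition of the real line of $z_N$ values into $\hat\psi_N = 14$ intervals,

$$\hat\Delta_N(t) = (\hat\gamma_N(t-1),\ \hat\gamma_N(t)) \qquad t = 1,\ldots,14$$

where $\hat\gamma_N(0) = -\infty$, $\hat\gamma_N(14) = +\infty$, as follows:

$\hat\Delta_N(1) = (\hat\gamma_N(0),\hat\gamma_N(1)) = (-\infty,-3)$
$\hat\Delta_N(2) = (\hat\gamma_N(1),\hat\gamma_N(2)) = (-3,-2.5)$
$\hat\Delta_N(3) = (\hat\gamma_N(2),\hat\gamma_N(3)) = (-2.5,-2)$
$\hat\Delta_N(4) = (\hat\gamma_N(3),\hat\gamma_N(4)) = (-2,-1.5)$
$\hat\Delta_N(5) = (\hat\gamma_N(4),\hat\gamma_N(5)) = (-1.5,-1)$
$\hat\Delta_N(6) = (\hat\gamma_N(5),\hat\gamma_N(6)) = (-1,-.5)$
$\hat\Delta_N(7) = (\hat\gamma_N(6),\hat\gamma_N(7)) = (-.5,0)$
$\hat\Delta_N(8) = (\hat\gamma_N(7),\hat\gamma_N(8)) = (0,.5)$
$\hat\Delta_N(9) = (\hat\gamma_N(8),\hat\gamma_N(9)) = (.5,1)$
$\hat\Delta_N(10) = (\hat\gamma_N(9),\hat\gamma_N(10)) = (1,1.5)$
$\hat\Delta_N(11) = (\hat\gamma_N(10),\hat\gamma_N(11)) = (1.5,2)$
$\hat\Delta_N(12) = (\hat\gamma_N(11),\hat\gamma_N(12)) = (2,2.5)$
$\hat\Delta_N(13) = (\hat\gamma_N(12),\hat\gamma_N(13)) = (2.5,3)$
$\hat\Delta_N(14) = (\hat\gamma_N(13),\hat\gamma_N(14)) = (3,\infty)$

(9.63)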
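The construction of the $z_N$ partition is mechanical and can be sketched in a few lines. The lists below encode the $x_N$ joining points of (9.59) and the noise-density breakpoints of example 9.3; the function name `z_boundaries` is illustrative.

```python
# Joining points of the x-conditional cost, from (9.59)
gamma = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Breakpoints of the piecewise-constant noise density of example 9.3
sigma = [-2.0, -1.0, 1.0, 2.0]

def z_boundaries(gamma, sigma):
    """All distinct differences gamma(t) - sigma(s), sorted in increasing order."""
    return sorted({g - s for g in gamma for s in sigma})

edges = z_boundaries(gamma, sigma)
```

The resulting list has 13 entries, so together with $\pm\infty$ it delimits the 14 intervals of (9.63).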


Applying the integration limit values specified by (9.62) to the computation of (9.61), we can obtain each $\hat{\hat V}_N(z_N|r_{N-1}=1)$ piece. These calculations can be simplified if we first calculate $\hat{\hat V}_N(z_N;1)$ (i.e., for $z_N \in \hat\Delta_N(1)$) and we then successively calculate (for each $t = 2,\ldots,\hat\psi_N$) the piece $\hat{\hat V}_N(z_N;t)$ from $\hat{\hat V}_N(z_N;t-1)$, by adding or subtracting (as appropriate) those integrals in (9.61) whose limits change when we move from $\hat\Delta_N(t-1)$ to $\hat\Delta_N(t)$.

For $z_N \in \hat\Delta_N(1) = (-\infty,-3)$ we obtain

$$\hat{\hat V}_N(z_N;1) = \int_{-2}^{-1} \frac{1}{8}\hat V_N(z_N+v;1)\,dv + \int_{-1}^{1} \frac{3}{8}\hat V_N(z_N+v;1)\,dv + \int_{1}^{2} \frac{1}{8}\hat V_N(z_N+v;1)\,dv = (.25)z_N^2 + 750.208 . \qquad (9.64)$$

For $z_N \in \hat\Delta_N(1)$, each of the other fifteen integrals in (9.61) has the same upper and lower limit of integration; hence they are zero. As is evident in (9.64), $x_N = z_N + v$ is in $\Delta_N(1)$ for any noise magnitude in $(-2,2)$. Consequently $\hat{\hat V}_N(z_N;1)$ is quadratic in $z_N$.

For $z_N \in \hat\Delta_N(2) = (-3,-2.5)$ we have

$$\hat{\hat V}_N(z_N;2) = \hat{\hat V}_N(z_N;1) + \int_{-1-z_N}^{2} \frac{1}{8}\left( \hat V_N(z_N+v;2) - \hat V_N(z_N+v;1) \right) dv = (.020833)z_N^3 + (.375)z_N^2 - (62.25)z_N + 562.896 .$$

Following this procedure for the remaining $z_N$ intervals, we obtain the z-conditional cost for this example, (9.65).
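The pieces of the z-conditional cost can be cross-checked by direct integration. The sketch below is an illustration under stated assumptions: `xcost` transcribes the x-conditional cost of (9.58), `NOISE` encodes the noise density of example 9.3, and each smooth sub-interval is integrated with a 3-point Gauss-Legendre rule (exact here because the integrand is a polynomial of degree two on each sub-interval).

```python
import numpy as np

GAMMA = [-1.0, -0.5, 0.0, 0.5, 1.0]                                   # x joining points (9.59)
NOISE = [(-2.0, -1.0, 0.125), (-1.0, 1.0, 0.375), (1.0, 2.0, 0.125)]  # (lo, hi, density)
NODES, WEIGHTS = np.polynomial.legendre.leggauss(3)

def xcost(x):
    # x-conditional expected cost, as transcribed from (9.58)
    ax = abs(x)
    if ax > 1.0:
        return 0.25 * x * x + 750.0
    if ax > 0.5:
        return 0.75 * x * x + 250.0
    return -1.5 * x * x + 0.375 * ax + 250.0

def zcost(z):
    """z-conditional cost: integral of p(v) * xcost(z + v) dv, piece by piece."""
    total = 0.0
    for lo, hi, dens in NOISE:
        # Split the noise piece wherever z + v crosses an x joining point
        cuts = sorted({lo, hi} | {g - z for g in GAMMA if lo < g - z < hi})
        for a, b in zip(cuts[:-1], cuts[1:]):
            mid, half = 0.5 * (a + b), 0.5 * (b - a)
            vals = [xcost(z + mid + half * t) for t in NODES]
            total += dens * half * float(np.dot(WEIGHTS, vals))
    return total
```

Evaluating `zcost` inside $(-\infty,-3)$ and $(-3,-2.5)$ reproduces the quadratic of (9.64) and the cubic quoted for the second piece, to the precision of the printed coefficients.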


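This z-conditional cost $\hat{\hat V}_N(z_N|r_{N-1}=1)$ is continuous in $z_N$. The additive noise smooths out the discontinuities in $\hat V_N(x_N|r_{N-1}=1)$ at $x_N = \pm 1$.

For this example problem the $\hat\psi_N = 14$ constrained subproblems are

$$V_{N-1}(x_{N-1},r_{N-1}=1|t) = \min_{u_{N-1}\ \text{s.t.}\ z_N \in \hat\Delta_N(t)} \left\{ u_{N-1}^2 + \hat{\hat V}_N(z_N;t) \right\} \qquad (9.66)$$

for $t = 1,2,\ldots,\hat\psi_N = 14$.

Substituting

$$u_{N-1} = z_N - a(1)x_{N-1} = z_N - x_{N-1} \qquad (9.67)$$

in (9.66) we obtain

$$V_{N-1}(x_{N-1},r_{N-1}=1|t) = \min_{z_N \in \hat\Delta_N(t)} \left\{ z_N^2 - 2z_N x_{N-1} + x_{N-1}^2 + \hat{\hat V}_N(z_N;t) \right\} . \qquad (9.68)$$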


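Each of the subproblems in (9.66) can be solved analytically. The extreme-piece subproblems (i.e., $t=1$ and $t=14$) have optimal expected costs that are piecewise-quadratic in $x_{N-1}$ with two pieces:

$$V_{N-1}(x_{N-1},r_{N-1}=1|1) = \begin{cases} V^{1,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } x_{N-1} \le -3.75 \\ V^{1,R}_{N-1} = x_{N-1}^2 + 6x_{N-1} + 761.5 & \text{if } x_{N-1} > \Theta_{N-1}(1) = -3.75 \end{cases}$$

$$u_{N-1}(x_{N-1},r_{N-1}=1|1) = \begin{cases} u^{1,U}_{N-1} = -.2x_{N-1} & \text{if } x_{N-1} \le \Theta_{N-1}(1) \\ u^{1,R}_{N-1} = -x_{N-1} - 3^{-} & \text{if } x_{N-1} > \Theta_{N-1}(1) \end{cases}$$

$$V_{N-1}(x_{N-1},r_{N-1}=1|14) = \begin{cases} V^{14,L}_{N-1} = x_{N-1}^2 - 6x_{N-1} + 761.5 & \text{if } x_{N-1} < 3.75 \\ V^{14,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } x_{N-1} \ge \theta_{N-1}(14) = 3.75 \end{cases}$$

$$u_{N-1}(x_{N-1},r_{N-1}=1|14) = \begin{cases} u^{14,L}_{N-1} = -x_{N-1} + 3^{+} & \text{if } x_{N-1} < \theta_{N-1}(14) \\ u^{14,U}_{N-1} = -.2x_{N-1} & \text{if } x_{N-1} \ge \theta_{N-1}(14) \end{cases}$$

$V^{1,U}_{N-1}$ and $V^{14,U}_{N-1}$ are quadratic in $x_{N-1}$ because $\hat{\hat V}_N(z_N;1)$ and $\hat{\hat V}_N(z_N;14)$ are quadratic in $z_N$.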
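For the left extreme piece the constrained minimization has a closed form, since the objective is convex in $z_N$. The sketch below assumes the quadratic z-conditional piece $(.25)z_N^2 + 750.208$ on $z_N \le -3$ and the substitution $u_{N-1} = z_N - x_{N-1}$; the function name is illustrative.

```python
def endpiece_t1(x):
    """Minimize (z - x)^2 + 0.25 z^2 + 750.208 over z <= -3.

    The objective is convex in z with unconstrained minimizer z = 0.8 x,
    so the constrained minimizer is that point clipped to the boundary -3.
    Returns (cost, u, z).
    """
    z = min(0.8 * x, -3.0)
    u = z - x                      # control that places z, with a(1) = b(1) = 1
    cost = u * u + 0.25 * z * z + 750.208
    return cost, u, z
```

For $x_{N-1} \le -3.75$ this returns the unconstrained cost $.2x_{N-1}^2 + 750.208$ with $u_{N-1} = -.2x_{N-1}$; otherwise the constraint is active and the cost is $x_{N-1}^2 + 6x_{N-1} + 761.458$.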

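The other subproblems in (9.66) for this example have a three-piece solution structure:

$$V_{N-1}(x_{N-1},r_{N-1}=1|t) = \begin{cases} V^{t,L}_{N-1}(x_{N-1},1) & \text{if } x_{N-1} \le \theta_{N-1}(t) \\ V^{t,U}_{N-1}(x_{N-1},1) & \text{if } \theta_{N-1}(t) < x_{N-1} < \Theta_{N-1}(t) \\ V^{t,R}_{N-1}(x_{N-1},1) & \text{if } \Theta_{N-1}(t) \le x_{N-1} \end{cases} \qquad (9.69)$$

We can find the actively constrained cost pieces $V^{t,L}_{N-1}(x_{N-1},1)$ and $V^{t,R}_{N-1}(x_{N-1},1)$ in (9.69) quite easily:

•  For $t = 2,\ldots,\hat\psi_N = 14$ find $V^{t,L}_{N-1}(x_{N-1},1)$ by evaluating (9.68) with $z_N = \hat\gamma_N(t-1)$.

•  For $t = 1,\ldots,\hat\psi_N - 1$ find $V^{t,R}_{N-1}(x_{N-1},1)$ by evaluating (9.68) with $z_N = \hat\gamma_N(t)$.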


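Following these steps for this example we obtain:

$V^{1,R}_{N-1} = V^{2,L}_{N-1} = x_{N-1}^2 + 6x_{N-1} + 761.5$
$V^{2,R}_{N-1} = V^{3,L}_{N-1} = x_{N-1}^2 + 5x_{N-1} + 726.8$
$V^{3,R}_{N-1} = V^{4,L}_{N-1} = x_{N-1}^2 + 4x_{N-1} + 692.7$
$V^{4,R}_{N-1} = V^{5,L}_{N-1} = x_{N-1}^2 + 3x_{N-1} + 596.8$
$V^{5,R}_{N-1} = V^{6,L}_{N-1} = x_{N-1}^2 + 2x_{N-1} + 501.5$
$V^{6,R}_{N-1} = V^{7,L}_{N-1} = x_{N-1}^2 + x_{N-1} + 438.0$
$V^{7,R}_{N-1} = V^{8,L}_{N-1} = x_{N-1}^2 + 375.3$
$V^{8,R}_{N-1} = V^{9,L}_{N-1} = x_{N-1}^2 - x_{N-1} + 438.0$
$V^{9,R}_{N-1} = V^{10,L}_{N-1} = x_{N-1}^2 - 2x_{N-1} + 501.5$
$V^{10,R}_{N-1} = V^{11,L}_{N-1} = x_{N-1}^2 - 3x_{N-1} + 596.8$
$V^{11,R}_{N-1} = V^{12,L}_{N-1} = x_{N-1}^2 - 4x_{N-1} + 692.7$
$V^{12,R}_{N-1} = V^{13,L}_{N-1} = x_{N-1}^2 - 5x_{N-1} + 726.8$
$V^{13,R}_{N-1} = V^{14,L}_{N-1} = x_{N-1}^2 - 6x_{N-1} + 761.5$

(9.70)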


From (9.69) we see that the actively-constrained cost-pieces ($V^{t,L}_{N-1}$ and $V^{t,R}_{N-1}$) of $V_{N-1}(x_{N-1},r_{N-1}=1|t)$ will always be quadratic in $x_{N-1}$ (with $x_{N-1}^2$, $x_{N-1}^1$, $x_{N-1}^0$ terms) regardless of the form of $\hat{\hat V}_N(z_N;t)$.

The unconstrained costs $V^{t,U}_{N-1}$ are not quadratic in $x_{N-1}$, in general. The difficulty in solving for nonquadratic $V^{t,U}_{N-1}$ may necessitate the use of an approximation to $V_{N-1}(x_{N-1},r_{N-1}=1)$. One such approximation is as follows:


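For $t = 2,3,\ldots,\hat\psi_N - 1 = 13$, each subproblem's optimal cost $V_{N-1}(x_{N-1},r_{N-1}=1|t)$ is bounded above at every $x_{N-1}$ value as follows:

$$V_{N-1}(x_{N-1},r_{N-1}=1|t) \le \begin{cases} V^{t,L}_{N-1}(x_{N-1},1) & \text{if } x_{N-1} \le \tilde\theta_{N-1}(t) \\ V^{t,R}_{N-1}(x_{N-1},1) & \text{if } x_{N-1} \ge \tilde\theta_{N-1}(t) \end{cases} \qquad (9.71)$$

where we define

$$\tilde\theta_{N-1}(t) \triangleq x_{N-1} \text{ value where } V^{t,L}_{N-1}(x_{N-1},1) \text{ and } V^{t,R}_{N-1}(x_{N-1},1) \text{ intersect} . \qquad (9.73)$$

Note that

$$\theta_{N-1}(t) \le \tilde\theta_{N-1}(t) \le \Theta_{N-1}(t) . \qquad (9.72)$$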


Using these upper bounds on the subproblem optimal costs, a suboptimal approximation of the optimal expected cost-to-go,

$$V_{N-1}(x_{N-1},r_{N-1}=1) = \min_{t} \left\{ V_{N-1}(x_{N-1},r_{N-1}=1|t) \right\} , \qquad (9.74)$$

can be obtained by performing the following comparison at each $x_{N-1}$ value:

$$\tilde V_{N-1}(x_{N-1},r_{N-1}=1) = \min \left\{ V_{N-1}(x_{N-1},r_{N-1}=1|1),\ V_{N-1}(x_{N-1},r_{N-1}=1|14),\ \{ \bar V_{N-1}(x_{N-1};t);\ t = 2,\ldots,\hat\psi_N - 1 = 13 \} \right\} \qquad (9.75)$$

where $\bar V_{N-1}(x_{N-1};t)$ denotes the upper bound of (9.71). This approximate controller involves the comparison of cost functions that are all piecewise-quadratic in $x_{N-1}$. This comparison can therefore be carried out using the JLPQ algorithm of chapter 8. The resulting suboptimal controller has an expected cost-to-go that is piecewise-quadratic in $x_{N-1}$ and a control law that is piecewise-linear.

The suboptimal controller (9.73), (9.75) can be interpreted as follows: for each $x_{N-1}$ value we either

•  use the left-endpiece of the optimal controller, $u^{Le}_{N-1}(x_{N-1},1) = u^{1,U}_{N-1}(x_{N-1},1)$,

or

•  hedge to one of the $\hat\psi_N - 1 = 13$ joining points of the z-conditional cost $\hat{\hat V}_N(z_N|r_{N-1}=1)$; that is, we hedge to one of the $\hat\gamma_N(t)$,

or

•  use the right-endpiece of the optimal controller, $u^{Re}_{N-1}(x_{N-1},1) = u^{14,U}_{N-1}(x_{N-1},1)$.


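For this example the intersections $\tilde\theta_{N-1}(t)$ of (9.73) are as follows:

Performing the comparison in (9.75) we find that the approximate controller is given by

$$\tilde V_{N-1}(x_{N-1},r_{N-1}=1) = \begin{cases} V^{1,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } x_{N-1} \le -21.648 \\ V^{7,R}_{N-1} = V^{8,L}_{N-1} = x_{N-1}^2 + 375.3 & \text{if } -21.648 \le x_{N-1} \le 21.648 \\ V^{14,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } 21.648 \le x_{N-1} \end{cases}$$

$$\tilde u_{N-1}(x_{N-1},r_{N-1}=1) = \begin{cases} u^{1,U}_{N-1} = -.2x_{N-1} & \text{if } x_{N-1} \le -21.648 \\ u^{7,R}_{N-1} = u^{8,L}_{N-1} = -x_{N-1} & \text{if } -21.648 \le x_{N-1} \le 21.648 \\ u^{14,U}_{N-1} = -.2x_{N-1} & \text{if } 21.648 \le x_{N-1} \end{cases} \qquad (9.76)$$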


This suboptimal controller has three obvious advantageous properties:

(1)  It is obtained without computing the inactive-constraint costs $V^{t,U}_{N-1}$ for $t \ne 1, \hat\psi_N$.

(2)  All of the comparisons in (9.75) are between quadratic functions, so they can be easily obtained using the quadratic formula.

(3)  The resulting controller is piecewise-linear in $x_{N-1}$, which facilitates implementation.
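Property (2) can be illustrated with a small helper that intersects two quadratics. This is a sketch only: the tuple representation `(a, b, c)` for $a x^2 + b x + c$ and the function name are assumptions, and the constants used in the example call are the endpiece cost $.2x^2 + 750.208$ and the hedging cost $x^2 + 375.298$ of this example.

```python
import math

def intersections(q1, q2):
    """Real x values where a1 x^2 + b1 x + c1 equals a2 x^2 + b2 x + c2.

    Each quadratic is a tuple (a, b, c). Returns a sorted list of 0, 1 or 2 roots.
    """
    a, b, c = q1[0] - q2[0], q1[1] - q2[1], q1[2] - q2[2]
    if abs(a) < 1e-12:
        # Equal curvature: the difference is linear (or constant)
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

# Crossover between the endpiece cost and the hedge-to-zero cost
pts = intersections((0.2, 0.0, 750.208), (1.0, 0.0, 375.298))
```

The two roots are symmetric about the origin and match the switching points $\pm 21.648$ of the approximate controller.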


To investigate the accuracy of this approximate controller for this example, let us return to the optimal solution derivation. The inactive-constraint cost pieces $V^{t,U}_{N-1}$ in (9.69), when they are valid, are not quadratic in $x_{N-1}$ (in general), so they are more difficult to determine than the active constraint costs. We will demonstrate how they can be obtained by considering the subproblem

$$V_{N-1}(x_{N-1},r_{N-1}=1|7) = \min_{u_{N-1}\ \text{s.t.}\ -.5 < z_N < 0} \left\{ u_{N-1}^2 + \hat{\hat V}_N(z_N;7) \right\} \qquad (9.77)$$

in detail. From (9.65) we can rewrite (9.77) as

$$V_{N-1}(x_{N-1},r_{N-1}=1|7) = \min_{z_N \in (-.5,0)} \left\{ (.0416667)z_N^3 + (1.375)z_N^2 - (124.875 + 2x_{N-1})z_N + 375.298 + x_{N-1}^2 \right\} . \qquad (9.78)$$

Differentiating with respect to $z_N$ we have

$$\frac{\partial V_{N-1}(x_{N-1},r_{N-1}=1|7)}{\partial z_N} = (.125)z_N^2 + (2.75)z_N - (124.875 + 2x_{N-1}) \qquad (9.79)$$

$$\frac{\partial^2 V_{N-1}(x_{N-1},r_{N-1}=1|7)}{(\partial z_N)^2} = .25z_N + 2.75 . \qquad (9.80)$$

From (9.80) we see that

$$\frac{\partial^2 V_{N-1}(x_{N-1},r_{N-1}=1|7)}{(\partial z_N)^2} > 0 \quad \text{for } z_N \in \hat\Delta_N(7) = (-.5,0) . \qquad (9.81)$$


Setting (9.79) to zero and solving for $z_N$, we obtain the stationary points of $V_{N-1}(x_{N-1},r_{N-1}=1|7)$:

$$z_N = -11 \pm \sqrt{1120 + 16x_{N-1}} \qquad (9.82)$$

if and only if $x_{N-1} \ge -70$.

If $x_{N-1} < -70$, there are no stationary points. From (9.79) we see that if $x_{N-1} < -70$ then

$$\frac{\partial V_{N-1}(x_{N-1},r_{N-1}=1|7)}{\partial z_N} > 0 ;$$

hence the minimizing $z_N$ in (9.78) is on the left boundary of $\hat\Delta_N(7)$ (i.e., at $z_N = \hat\gamma_N(6)^{+} = -.5^{+}$) when $x_{N-1} < -70$.

From (9.81) we have that the optimal $z_N$ value in (9.78) is given by

$$z_N = -11 + \sqrt{1120 + 16x_{N-1}}$$

if this $z_N$ is, in fact, in $\hat\Delta_N(7) = (-.5,0)$ (and $x_{N-1} \ge -70$). That is, if

$$-63.09375 \le x_{N-1} \le -62.4375 .$$
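The solution of the $t = 7$ subproblem can be sketched directly from (9.79) and (9.82). The function names below are illustrative; the clipping logic relies on the derivative (9.79) being increasing in $z_N$ for $z_N > -11$, so that the constrained minimizer is the stationary point moved to the nearest boundary of $(-.5, 0)$ when it falls outside.

```python
import math

def stationary_z(x):
    """Larger root of (9.79) = 0, i.e. z = -11 + sqrt(1120 + 16 x); None if not real."""
    disc = 1120.0 + 16.0 * x
    if disc < 0:
        return None
    return -11.0 + math.sqrt(disc)

def dcost_dz(z, x):
    # Derivative (9.79) of the subproblem objective with respect to z
    return 0.125 * z * z + 2.75 * z - (124.875 + 2.0 * x)

def solve_t7(x):
    """Return (z, u) minimizing the t = 7 objective over the closure of (-.5, 0)."""
    z = stationary_z(x)
    if z is None or z < -0.5:
        z = -0.5            # derivative positive on the interval: left boundary
    elif z > 0.0:
        z = 0.0             # derivative negative on the interval: right boundary
    return z, z - x         # u = z - x since a(1) = b(1) = 1
```

For $x_{N-1}$ in the narrow band where the stationary point lies inside $(-.5,0)$ the constraint is inactive; on either side the solution hedges to a boundary of the interval.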

When $x_{N-1} < -63.09375$, the minimizing $z_N$ in $\hat\Delta_N(7)$ is on the left boundary, and when $x_{N-1} > -62.4375$ it is on the right boundary.


Following this optimization procedure for each subproblem in (9.68) we obtain the subproblem solution joining points $\theta_{N-1}(t)$, $\Theta_{N-1}(t)$, the costs $V^{t,U}_{N-1}(x_{N-1},1)$ and the controls $u^{t,U}_{N-1}(x_{N-1},1)$ as in (9.69). These are

$t$ | $V^{t,U}_{N-1}(x_{N-1},1)$
1 | $.2x_{N-1}^2 + 750.2$
2 | $x_{N-1}^2 + 44.00x_{N-1} + 2376. + [-61.67 - 1.333x_{N-1}]\sqrt{1480 + 32x_{N-1}}$
3 | $x_{N-1}^2 - 4.509x_{N-1} + 422.4 + [56.29 + 1.833x_{N-1}]\sqrt{-69.14 - 2.286x_{N-1}}$
4 | $x_{N-1}^2 - 20.67x_{N-1} - 1564. + [118.8 + 1.333x_{N-1}]\sqrt{-1900. - 21.33x_{N-1}}$
5 | $x_{N-1}^2 - 4.754x_{N-1} - 130.3 + [123.8 + 1.333x_{N-1}]\sqrt{-495.1 - 5.333x_{N-1}}$
6 | $x_{N-1}^2 - 3.929x_{N-1} + 134.4 + [41.22 + 1.333x_{N-1}]\sqrt{-565.3 - 4.571x_{N-1}}$
7 | $x_{N-1}^2 + 22x_{N-1} + 1860. + [-93.33 - 1.333x_{N-1}]\sqrt{1120 + 16x_{N-1}}$
8 | $x_{N-1}^2 - 22x_{N-1} + 1860. + [-93.33 + 1.333x_{N-1}]\sqrt{1120 - 16x_{N-1}}$
9 | $x_{N-1}^2 + 3.929x_{N-1} + 134.4 + [41.22 - 1.333x_{N-1}]\sqrt{-565.3 + 4.571x_{N-1}}$
10 | $x_{N-1}^2 + 4.754x_{N-1} - 130.3 + [123.8 - 1.333x_{N-1}]\sqrt{-495.1 + 5.333x_{N-1}}$
11 | $x_{N-1}^2 + 20.67x_{N-1} - 1564. + [118.8 - 1.333x_{N-1}]\sqrt{-1900 + 21.33x_{N-1}}$
12 | $x_{N-1}^2 + 4.509x_{N-1} + 422.4 + [56.29 - 1.833x_{N-1}]\sqrt{-69.14 + 2.286x_{N-1}}$
13 | $x_{N-1}^2 - 44.00x_{N-1} + 2376. + [-61.67 + 1.333x_{N-1}]\sqrt{1480 - 32x_{N-1}}$
14 | $.2x_{N-1}^2 + 750.2$

$t$ | $\theta_{N-1}(t)$ | $\Theta_{N-1}(t)$
1 |  | -3.75
2 | -34.9688 | -34.3672
3 | -38.3531 | -36.4484
4 | -96.2037 | -95.6370
5 | -95.6494 | -94.9694
6 | -125.577 | -124.983
7 | -63.0938 | -62.4375
8 | 62.4375 | 63.0938
9 | 124.983 | 125.577
10 | 94.9694 | 95.6494
11 | 95.6321 | 96.1968
12 | 36.4484 | 38.3531
13 | 34.3672 | 34.9688
14 | 3.75 |

Table 9.3: Joining Points of Constrained Subproblem Solutions $V_{N-1}(x_{N-1},r_{N-1}=1|t)$ of Example 9.3


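Performing the minimization in (9.74) we obtain the optimal expected cost-to-go

$$V_{N-1}(x_{N-1},r_{N-1}=1) = \begin{cases} V^{1,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } x_{N-1} \le -21.6 \\ V^{7,R}_{N-1} = V^{8,L}_{N-1} = x_{N-1}^2 + 375.3 & \text{if } -21.6 \le x_{N-1} \le 21.6 \\ V^{14,U}_{N-1} = .2x_{N-1}^2 + 750.2 & \text{if } 21.6 \le x_{N-1} \end{cases} \qquad (9.85)$$

$$u_{N-1}(x_{N-1},r_{N-1}=1) = \begin{cases} u^{1,U}_{N-1} = -.2x_{N-1} & \text{if } x_{N-1} < -21.6 \\ u^{7,R}_{N-1} = u^{8,L}_{N-1} = -x_{N-1} & \text{if } -21.6 < x_{N-1} < 21.6 \\ u^{14,U}_{N-1} = -.2x_{N-1} & \text{if } 21.6 < x_{N-1} \end{cases} \qquad (9.86)$$

$$z_N(x_{N-1},r_{N-1}=1) = \begin{cases} z^{1,U}_N = .8x_{N-1} & \text{if } x_{N-1} < -21.6 \\ z^{7,R}_N = z^{8,L}_N = 0 & \text{if } -21.6 < x_{N-1} < 21.6 \\ z^{14,U}_N = .8x_{N-1} & \text{if } 21.6 < x_{N-1} \end{cases} \qquad (9.87)$$

This optimal controller (9.85) - (9.87) is identical to the approximate controller (9.75) for this example problem. That is, at no point other than the endpieces are any unconstrained costs optimal.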

Comparing the solution to example 9.3 with that of example 8.4 (same problem but without noise) we note that the optimal controller in the noisy case is simpler than in the noiseless case. In example 8.4 (figure 8.4), $V_{N-1}(x_{N-1},r_{N-1}=1)$ had $m_{N-1}(1) = 9$ pieces. In example 9.3, $V_{N-1}(x_{N-1},r_{N-1}=1)$ has only $m_{N-1}(1) = 3$ pieces. The presence of additive input noise in example 9.3 makes many of the optimal strategies of example 8.4 (in particular hedging to $x_N = -1^{+}, -.5, .5, 1^{-}$) impossible. We note that the optimal control laws in (9.86) are the same as the endpiece and middlepiece control laws of example 8.4 (see table 8.4). The resulting optimal expected cost is, of course, higher for example 9.3 because of the added uncertainty caused by the input noise.

1  graphically, or by finding the intersection of all of the subproblem solutions.

In this section we have obtained the solution of an example problem having nonquadratic $\hat{\hat V}_N(z_N|r_{N-1})$ pieces. In the process we also developed an approximate controller (which, for this example, yields the true optimal). In the next section we will derive a general one-step solution to JLPC problems described by (9.1) - (9.16), patterned after the solution of example 9.3. The investigation of approximate solutions will be continued in section 9.9.


9.5  One  Stage  Solution  of  the  Noisy  JLPC  Problem 

In this section we develop a formal procedure for the solution of the noisy JLPC problem that was formulated in section 9.2. We begin by presenting a proposition which describes the one-stage solution. The proof of this result is constructive; it is essentially a formalization of the solution technique applied to examples 9.1 and 9.3. The one-stage solution can be applied inductively (backwards in time from finite terminal time N) to solve (9.1) - (9.16) at each time stage.


The  one-stage  solution  result  is  as  follows: 


Proposition 9.1:

Consider a noisy JLPC problem as in (9.1) - (9.16). If at time $k = \ell + 1$ the following three statements are true for each $r_{\ell+1} = i \in M$, then they are also true at time $k = \ell$ for each $r_\ell = j \in M$:

(i)  $V_k(x_k,r_k=j)$ consists of $m_k(j)$ pieces joined continuously at $\{\delta_k^j(1) < \delta_k^j(2) < \cdots < \delta_k^j(m_k(j)-1)\}$:

$$V_k(x_k,r_k=j) = V_k(x_k;t) \quad \text{for } \delta_k^j(t-1) < x_k < \delta_k^j(t), \qquad t = 1,\ldots,m_k(j) \qquad (9.88)$$

(here $\delta_k^j(0) \triangleq -\infty$, $\delta_k^j(m_k(j)) \triangleq \infty$).

(ii)  Over its domain $(\delta_k^j(t-1),\ \delta_k^j(t))$, each piece $V_k(x_k;t)$ is twice continuously differentiable in $x_k$, with either

$$\frac{\partial^2 V_k(x_k;t)}{(\partial x_k)^2} > 0 \quad \text{or} \quad \frac{\partial^2 V_k(x_k;t)}{(\partial x_k)^2} < 0 \quad \text{or} \quad \frac{\partial^2 V_k(x_k;t)}{(\partial x_k)^2} = 0 \qquad (9.89)$$

throughout $\delta_k^j(t-1) < x_k < \delta_k^j(t)$. That is, over each piece $V_k(x_k,r_k=j)$ is either everywhere convex or everywhere concave.

(iii)  At each joining point $\delta_k^j(t)$ $(t = 1,\ldots,m_k(j)-1)$ either

$$\frac{\partial V_k(x_k,r_k=j)}{\partial x_k}$$

is continuous or it decreases discontinuously.

This proposition is a generalization of the JLPQ and JLQ one step solutions (Propositions 5.1 and 8.1). If we think of the x-operating cost at time $k = N$ as the sum of $Q(x_N,r_N)$ and $Q_T(x_N,r_N)$ (and thus think of $V_N(x_N,r_N) = 0$), then conditions (i) - (iii) are met at $k = N$. This proposition can then be applied inductively to solve (9.1) - (9.16) at each time stage.


Proof  of  Proposition  9.1; 

This  proposition  is  proven  in  a  constructive  manner,  similar  to 
the  proofs  of  Proposition  5.1  (JLQ  one  step  solution)  and  Proposition  8.1 
(JLPQ  one  step  solution) .  We  will  sketch  the  proof  here;  details 
appear  in  appendix  D,4 . 

For  each  form  r^  ®  j  6  H  the  minimization  in  (9.17)  is  converted 
into  the  comparison  of  a  finite  set  of  constrained  in  z^+^  sub¬ 
problems,  where  is  as  defined  in  (9.50)  -  (9.53).  These  are 

then  solved  and  compared  at  each  x^  to  obtain 
done  via  the  following  steps; 


k(xk'rk=:j)’  This  is 
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UNCLASSIFIED 


NL 


STEP  Is 


STEP  2: 


Obtain  a  composite  partition  of  xk+1  values  from  the 
partitions  associated  with  the  x-costs,  Q(x  ,r  =i)  , 

~  KtI  JCtJ. 

the  form  transition  probabilities  p(j,i:x^+^)  and  the 
expected  costs-to-go  vk+]_  '  rk+1=  ^  for  each  i  6  . 

This  partition 

Wti111  ■  'ii11-11'  Clu  ‘  t=1 . *k+i» 

is  obtained  as  described  in  section  9.3  (in  (9.22)  -  (9.24)). 
This  step  is  the  same  as  for  the  noiseless  JLPQ  problems 
of  chapter  8. 

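STEP 2: Obtaining $\hat V_{k+1}(x_{k+1} \mid r_k = j)$.

For each $x_{k+1}$ interval, $\Delta^j_{k+1}(t)$, compute the conditional expected cost-to-go $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ by

$$\hat V^j_{k+1}(x_{k+1}; t) = \sum_{i \in C_j} p(j, i; x_{k+1}) \left[ V_{k+1}(x_{k+1}, r_{k+1} = i) + Q(x_{k+1}, r_{k+1} = i) \right] \qquad (9.90)$$

obtaining

$$\hat V_{k+1}(x_{k+1} \mid r_k = j) = \hat V^j_{k+1}(x_{k+1}; t) \quad \text{for } x_{k+1} \in \Delta^j_{k+1}(t), \quad t = 1, \ldots, \psi_{k+1}. \qquad (9.91)$$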


Since $p(j, i; x_{k+1})$, $V_{k+1}(x_{k+1}, r_{k+1} = i)$ and $Q(x_{k+1}, r_{k+1} = i)$ are each twice continuously differentiable in $x_{k+1}$ except at a finite number of points (for each $i \in C_j$), $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ is also twice continuously differentiable except at finitely many points. Each piece $\hat V^j_{k+1}(x_{k+1}; t)$ is twice continuously differentiable in $x_{k+1}$ over $\Delta^j_{k+1}(t)$.

STEP 3: Obtaining $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ and a partition of $z_{k+1}$ values:


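We now have

$$V_k(x_k, r_k = j) = \min_{u_k} \left\{ u_k^2 R(j) + E\left\{ \hat V_{k+1}(x_{k+1} \mid r_k = j) \right\} \right\}. \qquad (9.92)$$

We seek to reformulate (9.92) as

$$V_k(x_k, r_k = j) = \min_{t = 1, \ldots, \hat\psi_{k+1}} \left\{ V_k(x_k, r_k = j \mid t) \right\} \qquad (9.93)$$

where

$$V_k(x_k, r_k = j \mid t) \triangleq V_k(x_k, r_k = j \mid z_{k+1} \in \hat\Delta^j_{k+1}(t)) = \min_{u_k \ \text{s.t.}\ z_{k+1} \in \hat\Delta^j_{k+1}(t)} \left\{ u_k^2 R(j) + \hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) \right\} \qquad (9.94)$$

and

$$z_{k+1} = a(r_k) x_k + b(r_k) u_k = x_{k+1} - \Xi(r_k) v_k. \qquad (9.95)$$

Here

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) = E\left\{ \hat V_{k+1}(x_{k+1} \mid r_k = j) \,\middle|\, z_{k+1},\ r_k = j \right\}. \qquad (9.96)$$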

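We claim that the z-conditional cost $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ has a piecewise structure in $z_{k+1}$. Specifically,

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) = \hat{\hat V}^j_{k+1}(z_{k+1}; t) \quad \text{for } z_{k+1} \in \hat\Delta^j_{k+1}(t) \quad (t = 1, \ldots, \hat\psi_{k+1}) \qquad (9.97)$$

where the $z_{k+1}$ intervals in (9.94), (9.97) are

$$\hat\Delta^j_{k+1}(t) = (\hat\gamma^j_{k+1}(t-1),\ \hat\gamma^j_{k+1}(t)),$$

forming a partition of the real line with

$$-\infty = \hat\gamma^j_{k+1}(0) < \hat\gamma^j_{k+1}(1) < \cdots < \hat\gamma^j_{k+1}(\hat\psi_{k+1} - 1) < \hat\gamma^j_{k+1}(\hat\psi_{k+1}) = \infty. \qquad (9.98)$$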


We can express the z-conditional cost in (9.96) as

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) = \int_{-\infty}^{\infty} p(v)\, \hat V_{k+1}(z_{k+1} + \Xi(j) v \mid r_k = j)\, dv. \qquad (9.99)$$


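Recall from section 9.2 that the noise density p(v) has a piecewise structure with $\bar\sigma$ pieces. We can rewrite (9.99) to reflect this:

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) = \sum_{s=1}^{\bar\sigma} \int_{\sigma(s-1)}^{\sigma(s)} \omega(v; s)\, \hat V_{k+1}(z_{k+1} + \Xi(j) v \mid r_k = j)\, dv. \qquad (9.100)$$

Incorporating the piecewise structure of $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ in (9.100) we have

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) = \sum_{s=1}^{\bar\sigma} \Bigg\{ \int_{\sigma(s-1)}^{\max[\sigma(s-1),\ \min[\gamma^j_{k+1}(1) - z_{k+1},\ \sigma(s)]]} \omega(v; s)\, \hat V^j_{k+1}(z_{k+1} + v; 1)\, dv$$
$$\qquad + \sum_{t=2}^{\psi_{k+1} - 1} \int_{\min[\sigma(s),\ \max[\gamma^j_{k+1}(t-1) - z_{k+1},\ \sigma(s-1)]]}^{\max[\sigma(s-1),\ \min[\gamma^j_{k+1}(t) - z_{k+1},\ \sigma(s)]]} \omega(v; s)\, \hat V^j_{k+1}(z_{k+1} + v; t)\, dv$$
$$\qquad + \int_{\min[\sigma(s),\ \max[\gamma^j_{k+1}(\psi_{k+1} - 1) - z_{k+1},\ \sigma(s-1)]]}^{\sigma(s)} \omega(v; s)\, \hat V^j_{k+1}(z_{k+1} + v; \psi_{k+1})\, dv \Bigg\}. \qquad (9.101)$$

From (9.101) we see that for each value of $z_{k+1}$, the z-conditional cost is a sum of $\bar\sigma\, \psi_{k+1}$ integrals. The numerical values of the limits of integration in (9.101) depend upon the value of $z_{k+1}$. This gives $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ a piecewise structure in $z_{k+1}$:

We have a $z_{k+1}$ partition boundary, $\hat\gamma^j_{k+1}(t)$, at each $z_{k+1}$ value where one (or more) of the limits of integration in (9.101) changes.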
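A numerical stand-in for this piece-by-piece integration may help fix ideas. The sketch below assumes that the noise enters additively with unit gain, that the noise density has bounded support, and that both the density and the cost are supplied as lists of pieces with their edges; the names `simpson`, `z_conditional_cost` and `z_boundaries` are hypothetical. It uses a composite Simpson rule in place of the analytical integration described in the text.

```python
import math


def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi]; n must be even. Empty ranges give 0."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def z_conditional_cost(z, x_edges, x_pieces, v_edges, v_densities):
    """Integrate density(v) * cost(z + v), one (noise piece, cost piece) pair at a time.

    x_edges has len(x_pieces) + 1 entries (may start/end with -inf/+inf);
    v_edges has len(v_densities) + 1 finite entries.
    """
    total = 0.0
    for s, density in enumerate(v_densities):
        for t, piece in enumerate(x_pieces):
            # Clamp the noise range so that z + v stays inside cost piece t.
            lo = max(v_edges[s], x_edges[t] - z)
            hi = min(v_edges[s + 1], x_edges[t + 1] - z)
            total += simpson(lambda v: density(v) * piece(z + v), lo, hi)
    return total


def z_boundaries(x_edges, v_edges):
    """z values at which some clamped integration limit switches form."""
    finite = lambda edges: [e for e in edges if math.isfinite(e)]
    return sorted({g - s for g in finite(x_edges) for s in finite(v_edges)})


# Illustrative data: cost 0 for x < 0 and x^2 for x > 0; noise uniform on [-1, 1].
x_edges = [-math.inf, 0.0, math.inf]
x_pieces = [lambda x: 0.0, lambda x: x * x]
v_edges = [-1.0, 1.0]
v_densities = [lambda v: 0.5]

print(z_conditional_cost(0.0, x_edges, x_pieces, v_edges, v_densities))
print(z_boundaries(x_edges, v_edges))
```

The boundaries returned by `z_boundaries` are the differences between a cost breakpoint and a noise breakpoint, which is where a limit of integration changes from one expression to another.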


Since $\omega(v; s)$ is twice continuously differentiable in v for each s over $(\sigma(s-1), \sigma(s))$ and each $\hat V^j_{k+1}(x_{k+1}; t)$ is twice continuously differentiable (in $x_{k+1} = z_{k+1} + v$) over $(\gamma^j_{k+1}(t-1), \gamma^j_{k+1}(t))$, it follows from (9.101) that each z-conditional cost piece $\hat{\hat V}^j_{k+1}(z_{k+1}; t)$ is twice continuously differentiable (with respect to $z_{k+1}$) over its domain.

In order to satisfy (iii) of Proposition 9.1 (i.e., (9.89)) and for reasons we will discuss later, it is desirable to have additional grid points in the $z_{k+1}$ partition:

We also have a $z_{k+1}$ partition boundary $\hat\gamma^j_{k+1}(t)$ at each $z_{k+1}$ value where the quantity

$$\frac{\partial^2 \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{(\partial z_{k+1})^2} + \frac{2R(j)}{b^2(j)} \qquad (9.102)$$

changes sign, becomes zero or ceases to be zero, and we have a partition boundary at each $z_{k+1}$ value where the quantity

$$\frac{\partial^2 \hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)}{(\partial z_{k+1})^2}$$

changes sign. Consequently over each $z_{k+1}$ interval $\hat\Delta^j_{k+1}(t) = (\hat\gamma^j_{k+1}(t-1), \hat\gamma^j_{k+1}(t))$ the z-conditional cost piece $\hat{\hat V}^j_{k+1}(z_{k+1}; t)$ is twice continuously differentiable in $z_{k+1}$, and it is everywhere convex or concave over $\hat\Delta^j_{k+1}(t)$.


STEP 4: Solving the constrained-in-$z_{k+1}$ subproblems:

Having formulated the constrained-in-$z_{k+1}$ subproblems we must now solve them. Substituting the definition of $z_{k+1}$ of (9.95) into (9.94), these constrained subproblems become

$$V_k(x_k, r_k = j \mid t) = \min_{z_{k+1} \in \hat\Delta^j_{k+1}(t)} \left\{ \frac{z_{k+1}^2 R(j)}{b^2(j)} - \frac{2 a(j)\, z_{k+1}\, x_k R(j)}{b^2(j)} + \frac{a^2(j)\, x_k^2 R(j)}{b^2(j)} + \hat{\hat V}^j_{k+1}(z_{k+1}; t) \right\} \quad (\text{for } t = 1, \ldots, \hat\psi_{k+1}). \qquad (9.103)$$

We can solve each subproblem analytically, using the basic approach that was followed in example 9.3.
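For the special case in which the z-conditional cost piece is itself quadratic, one constrained subproblem of this kind can be solved in closed form. The following is a minimal sketch under that assumption; the function name, the closed-interval treatment of the endpoints and the sample coefficients (a = b = R = 1, piece (7/4)z² + 7/3 on z ≤ -1) are illustrative choices, not the thesis's general procedure.

```python
import math


def solve_quadratic_subproblem(x, a, b, R, c2, c1, c0, lo, hi):
    """Minimize (R/b^2)(z - a*x)^2 + c2*z^2 + c1*z + c0 over lo <= z <= hi.

    Returns (label, z, u, cost) where label is 'L' (left endpoint),
    'R' (right endpoint) or 'U' (interior, constraint inactive).
    """
    q = R / b**2

    def cost(z):
        return q * (z - a * x) ** 2 + c2 * z**2 + c1 * z + c0

    candidates = []
    if math.isfinite(lo):
        candidates.append(("L", lo))
    if math.isfinite(hi):
        candidates.append(("R", hi))

    curvature = 2.0 * (q + c2)  # second derivative of the objective in z
    if curvature > 0:
        z_star = (2.0 * q * a * x - c1) / curvature  # stationary point
        if lo < z_star < hi:
            candidates.append(("U", z_star))

    if not candidates:
        raise ValueError("objective is unbounded below on this interval")

    label, z = min(candidates, key=lambda c: cost(c[1]))
    u = (z - a * x) / b  # control that produces z from x
    return label, z, u, cost(z)


print(solve_quadratic_subproblem(-5.5, 1.0, 1.0, 1.0, 1.75, 0.0, 7 / 3, -math.inf, -1.0))
print(solve_quadratic_subproblem(0.0, 1.0, 1.0, 1.0, 1.75, 0.0, 7 / 3, -math.inf, -1.0))
```

In the first call the stationary point lies inside the interval, so the constraint is inactive; in the second it lies outside, so the solution sits on the right endpoint. When the piece is not quadratic the stationary point generally has to be found numerically.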

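The endpiece problems (i.e., t = 1 and $t = \hat\psi_{k+1}$) will have two-piece optimal expected costs:

$$V_k(x_k, r_k = j \mid 1) = \begin{cases} V^{1,U}_k(x_k, j) & \text{if } a(j) x_k \le \Theta^j_k(1) \\ V^{1,R}_k(x_k, j) & \text{if } a(j) x_k > \Theta^j_k(1) \end{cases} \qquad (9.104)$$

$$V_k(x_k, r_k = j \mid \hat\psi_{k+1}) = \begin{cases} V^{\hat\psi_{k+1},L}_k(x_k, j) & \text{if } a(j) x_k \le \theta^j_k(\hat\psi_{k+1}) \\ V^{\hat\psi_{k+1},U}_k(x_k, j) & \text{if } a(j) x_k > \theta^j_k(\hat\psi_{k+1}). \end{cases} \qquad (9.105)$$

The remaining subproblems $V_k(x_k, r_k = j \mid t)$ in (9.103) will have either a three-piece solution structure:

$$V_k(x_k, r_k = j \mid t) = \begin{cases} V^{t,L}_k(x_k, j) & \text{if } a(j) x_k < \theta^j_k(t) \\ V^{t,U}_k(x_k, j) & \text{if } \theta^j_k(t) < a(j) x_k \le \Theta^j_k(t) \\ V^{t,R}_k(x_k, j) & \text{if } \Theta^j_k(t) < a(j) x_k \end{cases} \qquad (9.106)$$

with $\theta^j_k(t) < \Theta^j_k(t)$, or they will have a two-piece solution structure

$$V_k(x_k, r_k = j \mid t) = \begin{cases} V^{t,L}_k(x_k, j) & \text{if } a(j) x_k \le \theta^j_k(t) = \Theta^j_k(t) \\ V^{t,R}_k(x_k, j) & \text{if } a(j) x_k > \theta^j_k(t) = \Theta^j_k(t). \end{cases} \qquad (9.107)$$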

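As in chapters 5 and 8, the superscripts L, R and U correspond, respectively, to driving $z_{k+1}$ to the left endpoint, the right endpoint, or the interior of the region $\hat\Delta^j_{k+1}(t)$.

As we indicated in the solution of example 9.3, the actively-constrained costs $V^{t,L}_k(x_k, j)$ and $V^{t,R}_k(x_k, j)$ are quadratic in $x_k$. Direct substitution in (9.103) yields the formulas

$$V^{t,L}_k(x_k, j) = \frac{a^2(j) R(j)}{b^2(j)} x_k^2 - \frac{2 a(j) R(j)\, \hat\gamma^j_{k+1}(t-1)}{b^2(j)} x_k + \frac{[\hat\gamma^j_{k+1}(t-1)]^2 R(j)}{b^2(j)} + \hat{\hat V}^j_{k+1}(z_{k+1}; t) \Big|_{z_{k+1} = \hat\gamma^j_{k+1}(t-1)} \qquad (9.108)$$

for $t = 2, \ldots, \hat\psi_{k+1}$ and

$$V^{t,R}_k(x_k, j) = \frac{a^2(j) R(j)}{b^2(j)} x_k^2 - \frac{2 a(j) R(j)\, \hat\gamma^j_{k+1}(t)}{b^2(j)} x_k + \frac{[\hat\gamma^j_{k+1}(t)]^2 R(j)}{b^2(j)} + \hat{\hat V}^j_{k+1}(z_{k+1}; t) \Big|_{z_{k+1} = \hat\gamma^j_{k+1}(t)} \qquad (9.109)$$

for $t = 1, \ldots, \hat\psi_{k+1} - 1$.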


The cost functions $V^{t,U}_k(x_k, j)$ in (9.106) are not piecewise-quadratic in $x_k$, in general, as we saw in example 9.3. Recall that we are considering (9.103) as an optimization over $z_{k+1}$. The inactive-constraint solution $z^{t,U}_{k+1}$ to $V_k(x_k, r_k = j \mid t)$ in (9.103) (if one exists) is a $z_{k+1}$ value in the interior of $\hat\Delta^j_{k+1}(t)$ that satisfies

$$\frac{\partial V_k(x_k, r_k = j \mid t)}{\partial z_{k+1}} = \frac{2 R(j)}{b^2(j)} z_{k+1} - \frac{2 a(j) R(j)\, x_k}{b^2(j)} + \frac{\partial \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{\partial z_{k+1}} = 0 \qquad (9.110)$$

$$\frac{\partial^2 V_k(x_k, r_k = j \mid t)}{(\partial z_{k+1})^2} = \frac{2 R(j)}{b^2(j)} + \frac{\partial^2 \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{(\partial z_{k+1})^2} \Bigg|_{z_{k+1} = z^{t,U}_{k+1}} > 0. \qquad (9.111)$$


By making each $\hat{\hat V}^j_{k+1}(z_{k+1}; t)$ piece either concave or convex over its domain (by adding "extra" points to the $z_{k+1}$ partition, where necessary), we have insured that there is, at most, one value of $z^{t,U}_{k+1}$ in the interval $\hat\Delta^j_{k+1}(t)$. It also insures that the cost function $V^{t,U}_k(x_k, j)$ is everywhere convex or everywhere concave over $(\theta^j_k(t), \Theta^j_k(t))$ in (9.106); this is needed to obtain (iii) of Proposition 9.1 (i.e., (9.89)).

A procedure for solving each of the $\hat\psi_{k+1}$ subproblems in (9.103) is described in Appendix D.4.
The $V^{t,L}_k(x_k, j)$, $V^{t,U}_k(x_k, j)$ and $V^{t,R}_k(x_k, j)$ in (9.104) - (9.107) possess similar properties at $\theta^j_k(t)$ and $\Theta^j_k(t)$ to the analogous quantities in chapters 5 and 8. In particular, we have the following:

- when (9.105) or (9.106) applies, at $x_k = \theta^j_k(t)/a(j)$ the slopes and values of $V^{t,L}_k(x_k, j)$ and $V^{t,U}_k(x_k, j)$ are the same,

- when (9.104) or (9.106) applies, at $x_k = \Theta^j_k(t)/a(j)$ the slopes and values of $V^{t,R}_k(x_k, j)$ and $V^{t,U}_k(x_k, j)$ are the same,

- when (9.107) applies (i.e., there is no $V^{t,U}_k$) then at $x_k = \theta^j_k(t)/a(j) = \Theta^j_k(t)/a(j)$, the values of $V^{t,R}_k(x_k, j)$ and $V^{t,L}_k(x_k, j)$ are the same but

$$\frac{\partial V^{t,L}_k(x_k, j)}{\partial x_k} \Bigg|_{x_k = \frac{\theta^j_k(t)}{a(j)} = \frac{\Theta^j_k(t)}{a(j)}} \ne \frac{\partial V^{t,R}_k(x_k, j)}{\partial x_k} \Bigg|_{x_k = \frac{\theta^j_k(t)}{a(j)} = \frac{\Theta^j_k(t)}{a(j)}}. \qquad (9.112)$$


STEP 5: Comparing the Constrained Costs:

The fifth step in this proof of Proposition 9.1 is to compare the solutions of the constrained problems specified by (9.103), as indicated in (9.93). This minimization involves the comparison of piecewise functions in $x_k$ (with structures as given in (9.104) - (9.107)). Since these function pieces are not all quadratic in $x_k$, this comparison is much more difficult (in general) than for the JLQ and JLPQ problems. We choose $V_k(x_k, r_k = j)$ at each $x_k$ value to be the candidate function in (9.93) having the least value. Thus $V_k(x_k, r_k = j)$ has the piecewise structure described by (i) of the Proposition. As we mentioned earlier, we partitioned the axis of $z_{k+1}$ values so that (ii) of the Proposition is satisfied by each $V^{t,U}_k(x_k, j)$ (and for each $V^{t,L}_k(x_k, j)$, $V^{t,R}_k(x_k, j)$ we see from (9.108) - (9.109) that (ii) is satisfied).

The fact that $\partial V_k(x_k, r_k = j)/\partial x_k$ is either continuous or decreases discontinuously follows directly from the comparison in (9.93). A joining point $\delta^j_k(\ell)$ can arise either from a crossing of candidates (here the slope decreases discontinuously), or from a change between parts of a subproblem solution $V_k(x_k, r_k = j \mid t)$ (i.e., from $V^{t,L}_k$ to $V^{t,U}_k$ in (9.105), (9.106), or from $V^{t,L}_k$ to $V^{t,R}_k$ in (9.107), or from $V^{t,U}_k$ to $V^{t,R}_k$ in (9.104), (9.106)); in these cases the slope of $V_k(x_k, r_k = j)$ is continuous or it decreases discontinuously at $\delta^j_k(\ell)$.

This concludes the proof of the one-stage solution given by Proposition 9.1.
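The comparison step can be illustrated numerically. The sketch below assumes each candidate cost is available as a callable together with the interval of $x_k$ values on which it is valid, and that exactly one switch of the minimizing candidate occurs between the two bracketing points passed to the bisection; the names `pointwise_min` and `find_switch` and the two sample quadratics are hypothetical.

```python
def pointwise_min(candidates, x):
    """Return (label, value) of the cheapest candidate that is valid at x.

    candidates maps a label to (lo, hi, f): f is valid for lo <= x <= hi.
    """
    best = None
    for label, (lo, hi, f) in candidates.items():
        if lo <= x <= hi:
            value = f(x)
            if best is None or value < best[1]:
                best = (label, value)
    return best


def find_switch(candidates, lo, hi, tol=1e-9):
    """Bisect for the point in [lo, hi] where the minimizing label changes."""
    left_label = pointwise_min(candidates, lo)[0]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pointwise_min(candidates, mid)[0] == left_label:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


inf = float("inf")
candidates = {
    "A": (-inf, inf, lambda x: x * x + 1.0),
    "B": (-inf, inf, lambda x: (x - 2.0) ** 2),
}
joining_point = find_switch(candidates, 0.0, 2.0)
print(joining_point)
```

At the crossing of these two sample candidates the slope of the lower envelope drops from 1.5 to -2.5, which is the "decreases discontinuously" behavior that a crossing of candidates produces.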


9.6 Qualitative Properties of the Optimal JLPC Controller


In  this  section  we  examine  several  qualitative  issues  related 
to  the  (off-line)  determination  of  the  optimal  control  laws  and 
costs  of  Proposition  9.1.  The  results  of  this  examination  are  used 
in  the  next  section  to  devise  an  algorithm  for  the  efficient 
computation  of  the  optimal  controller. 

We  begin  with  a  description  of  the  subproblem  solution  in 
(9.94)  -  (9.97) .  Some  of  the  properties  that  are  listed  in  the 
following  proposition  were  mentioned  in  the  preceding  section. 


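Proposition 9.2: Consider the constrained subproblem of finding $u_k$ satisfying

$$V_k(x_k, r_k = j \mid t) = \min_{u_k \ \text{s.t.}\ z_{k+1} \in \hat\Delta^j_{k+1}(t)} \left\{ u_k^2 R(j) + \hat{\hat V}^j_{k+1}(z_{k+1}; t) \right\} \qquad (9.113)$$

$$z_{k+1} = a(j) x_k + b(j) u_k \qquad (9.114)$$

$$\hat\Delta^j_{k+1}(t) = (\hat\gamma^j_{k+1}(t-1),\ \hat\gamma^j_{k+1}(t))$$

$$R(j) > 0, \quad a(j) \ne 0, \quad b(j) \ne 0.$$

The subproblem solutions possess the following properties:

1. For $t = 2, \ldots, \hat\psi_{k+1}$: if $a(j) x_k < \theta^j_k(t)$ then the minimizing $u_k$ in (9.113) is given by the control law

$$u_k = u^{t,L}_k(x_k, j) = \frac{\hat\gamma^j_{k+1}(t-1) - a(j) x_k}{b(j)} \qquad (9.115)$$

with the resulting $z_{k+1}$ value

$$z_{k+1} = z^{t,L}_{k+1}(x_k, r_k = j) = [\hat\gamma^j_{k+1}(t-1)]^+ \qquad (9.116)$$

and cost

$$V_k = V^{t,L}_k(x_k, r_k = j) = \left( \frac{\hat\gamma^j_{k+1}(t-1) - a(j) x_k}{b(j)} \right)^2 R(j) + \hat{\hat V}^j_{k+1}([\hat\gamma^j_{k+1}(t-1)]^+; t), \qquad (9.117)$$

which is quadratic in $x_k$.

2. For $t = 1, \ldots, \hat\psi_{k+1} - 1$: if $a(j) x_k > \Theta^j_k(t)$ then the minimizing $u_k$ in (9.113) is given by the control law

$$u_k = u^{t,R}_k(x_k, j) = \frac{\hat\gamma^j_{k+1}(t) - a(j) x_k}{b(j)} \qquad (9.118)$$

with the resulting $z_{k+1}$ value

$$z_{k+1} = z^{t,R}_{k+1}(x_k, r_k = j) = [\hat\gamma^j_{k+1}(t)]^- \qquad (9.119)$$

and cost

$$V_k = V^{t,R}_k(x_k, r_k = j) = \left( \frac{\hat\gamma^j_{k+1}(t) - a(j) x_k}{b(j)} \right)^2 R(j) + \hat{\hat V}^j_{k+1}([\hat\gamma^j_{k+1}(t)]^-; t), \qquad (9.120)$$

which is quadratic in $x_k$.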

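3. For $t = 2, \ldots, \hat\psi_{k+1} - 1$, if

$$\frac{\partial^2 \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{(\partial z_{k+1})^2} + \frac{2 R(j)}{b^2(j)} < 0 \qquad (9.121)$$

for all $z_{k+1} \in \hat\Delta^j_{k+1}(t)$, then

(i)

$$\theta^j_k(t) = \Theta^j_k(t) = \frac{\hat\gamma^j_{k+1}(t-1) + \hat\gamma^j_{k+1}(t)}{2} + \frac{b^2(j)}{2 R(j)} \cdot \frac{\hat{\hat V}^j_{k+1}([\hat\gamma^j_{k+1}(t)]^-; t) - \hat{\hat V}^j_{k+1}([\hat\gamma^j_{k+1}(t-1)]^+; t)}{\hat\gamma^j_{k+1}(t) - \hat\gamma^j_{k+1}(t-1)} \qquad (9.122)$$

(ii) At $x_k = \theta^j_k(t)/a(j) = \Theta^j_k(t)/a(j)$,

$$V^{t,R}_k(x_k, j) \Big|_{x_k = \theta^j_k(t)/a(j)} = V^{t,L}_k(x_k, j) \Big|_{x_k = \theta^j_k(t)/a(j)} \qquad (9.123)$$

and

$$\frac{\partial V^{t,R}_k(x_k, j)}{\partial x_k} \Bigg|_{x_k = \theta^j_k(t)/a(j)} = \frac{2 a(j) R(j)}{b^2(j)} \left[ \theta^j_k(t) - \hat\gamma^j_{k+1}(t) \right] < \frac{2 a(j) R(j)}{b^2(j)} \left[ \theta^j_k(t) - \hat\gamma^j_{k+1}(t-1) \right] = \frac{\partial V^{t,L}_k(x_k, j)}{\partial x_k} \Bigg|_{x_k = \theta^j_k(t)/a(j)}. \qquad (9.124)$$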


(iv)  For  t  ■  2, . . . , 
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l!v/ 


<  0  for  all  zktl  e  4^+1(t) 


(9.140) 


{3v2 


<  0  for  all  0^(t)  <  afjjx^  <  0jj(t) 


(9.141) 


- For the extreme cases t = 1 and $t = \hat\psi_{k+1}$, either (9.136) - (9.137) or (9.138) - (9.139) applies.


This proposition is proved in appendix D.5. It says that each subproblem optimal cost $V_k(x_k, r_k = j \mid t)$ in (9.113) has a two or three part structure. Note that we have constructed the partition of $z_{k+1}$ (i.e., the grid points $\{\hat\gamma^j_{k+1}(t)\}$) so that for each t, only one of the conditions (9.121), (9.136), (9.138) or (9.140) applies over the entire interval $\hat\Delta^j_{k+1}(t)$.

The unconstrained cost $V^{1,U}_k(x_k, r_k = j)$, which corresponds to driving $z_{k+1}$ into the leftmost $z_{k+1}$ interval, $\hat\Delta^j_{k+1}(1)$, has the two-part structure shown in figure 9.3(a). The actively constrained piece $V^{1,R}_k(x_k, j)$ is quadratic in $x_k$; the unconstrained piece $V^{1,U}_k(x_k, j)$ is, in general, not quadratic in $x_k$. The unconstrained cost $V^{\hat\psi_{k+1},U}_k(x_k, j)$, which corresponds to driving $z_{k+1}$ into the rightmost $z_{k+1}$ interval, $\hat\Delta^j_{k+1}(\hat\psi_{k+1})$, has an analogous two-part structure, as shown in figure 9.3(b). The unconstrained extreme pieces $V^{1,U}_k(x_k, j)$ and $V^{\hat\psi_{k+1},U}_k(x_k, j)$ each have nonnegative second derivatives over the regions of $x_k$ values where they are valid.

For $t = 2, \ldots, \hat\psi_{k+1} - 1$, if (9.121) holds then $V_k(x_k, r_k = j \mid t)$ in (9.113) has a piecewise-quadratic (in $x_k$) structure with two pieces, as shown in figure 9.4(a). From (9.124) we see that at their joining point, the slope of this subproblem optimal cost decreases discontinuously.

For $t = 2, \ldots, \hat\psi_{k+1} - 1$ with (9.125) holding, $V_k(x_k, r_k = j \mid t)$ has a three piece structure, as shown in figure 9.4(b). The actively constrained pieces $V^{t,L}_k(x_k, j)$ and $V^{t,R}_k(x_k, j)$ are each quadratic in $x_k$. The unconstrained piece $V^{t,U}_k(x_k, j)$ need not be quadratic. However, from (9.137), (9.139), (9.141) we see that $V^{t,U}_k(x_k, j)$ is either convex or concave over its entire domain of validity (i.e. for all $x_k$ values where $V_k(x_k, r_k = j \mid t) = V^{t,U}_k(x_k, j)$). At the joining points the slope of $V_k(x_k, r_k = j \mid t)$ is continuous.

From (9.126) - (9.127) we see that it may be quite difficult to determine the unconstrained control law $u^{t,U}_k(x_k, j)$. When $\hat{\hat V}^j_{k+1}(z; t)$ is quadratic in z there is clearly no difficulty. But for other $\hat{\hat V}^j_{k+1}(z; t)$ structures, (9.126) - (9.127) must be simultaneously solved to obtain $u^{t,U}_k(x_k, j)$ and $z^{t,U}_{k+1}(x_k, j)$. It is this difficulty that motivates the development of a suboptimal controller described later in this chapter.

Figure 9.3: Constrained subproblem solutions for extreme pieces of partition: (a) $V_k(x_k, r_k = j \mid 1)$; (b) $V_k(x_k, r_k = j \mid \hat\psi_{k+1})$


We  will  next  show  that  many  of  the  candidate  cost  pieces 
in  (9.104)  -  (9.107)  cannot  be  optimal  from  any  x^;  consequently 
they  need  not  be  calculated. 

The  following  proposition  eliminates  many  of  the  candidate 
costs  in  (9.92)  from  eligibility  for  the  optimal  cost. 


Proposition 9.3: In performing the minimization in (9.92), the
following candidate costs need not be examined:


(i)  if 


3  Vktl(*lH-l,tl  +  2R(i)  >  0 

(3w2  b2«> 


for  all  rfc+1  a  ^(t) 
and 

A  . 

32v^  (z  ;t+l) 

k+ll  k+1-  +  2R( j )  >  Q 

<sw2  b2<3) 

for  all  rktl  a  ^+1(t+D 


and  Vjt+^(z^+^|r^»j)  is  continuous  at  with 


then  we  need  not  examine 


,t,R 


.t+l,L, 


V  {Vj)  S  Vk  (V^ 


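(ii) if for all $z_{k+1} \in \hat\Delta^j_{k+1}(t)$ we have

$$\frac{\partial^2 \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{(\partial z_{k+1})^2} + \frac{2 R(j)}{b^2(j)} < 0$$

then $V^{t,U}_k(x_k, j)$ does not exist but we must examine $V^{t,R}_k(x_k, j)$ and $V^{t,L}_k(x_k, j)$.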


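(iii) if $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ is discontinuous at $z_{k+1} = \hat\gamma^j_{k+1}(t)$ with

$$\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) \Big|_{z_{k+1} = [\hat\gamma^j_{k+1}(t)]^-} < \hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j) \Big|_{z_{k+1} = [\hat\gamma^j_{k+1}(t)]^+}$$

then we need not examine $V^{t+1,L}_k(x_k, j)$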


(iv) 


if  Vk+l(zk+l'rk’j)  is  discontinuous  at  zk+]_  *  ^(t) 


with  (9. 141) reversed ,  then  we  need  not  examine 

vk  (V;I)' 


and 


with 


This proposition is a generalization of Propositions 5.2 and 8.2. It is proved in appendix D.6.

As we have seen in the examples of this chapter, the optimal controller hedges to certain values of the artificial variable $z_{k+1}$. The following corollary specifies necessary conditions for a point $z_{k+1} = z$ to which the system hedges.

Corollary 9.4:

If the optimal controller in Proposition 9.1 hedges from $(x_k, r_k = j)$ to the point $z_{k+1} = z$ then one (or more) of the following is true:

(1) z is a discontinuous point of the conditional cost $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$,

or

(2) z is a boundary ($\hat\gamma^j_{k+1}(t)$ or $\hat\gamma^j_{k+1}(t-1)$) of an interval $\hat\Delta^j_{k+1}(t)$ over which the conditional cost $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ has

$$\frac{\partial^2 \hat{\hat V}^j_{k+1}(z_{k+1}; t)}{(\partial z_{k+1})^2} + \frac{2 R(j)}{b^2(j)} < 0, \qquad (9.142)$$

or


-j 

(3)  z  *  (t)  is  a  boundary  of  intervals  A£(t)  and 

KtI  k 


A  .  A 

A^+^(t+l)  where  VVj_1  (zVj-1  |rt=j)  is  continuous  and 


k+1 '“k+1 1  k 


3vJ+1(z,t) 


>  0 


(9.143) 


Proof:

Hedging-to-a-point can occur only to finite boundary points of the $z_{k+1}$ intervals $\{\hat\Delta^j_{k+1}(t) : t = 1, \ldots, \hat\psi_{k+1}\}$; that is, to an element of the set $\{\hat\gamma^j_{k+1}(t) : t = 1, \ldots, \hat\psi_{k+1} - 1\}$. If the optimal controller drives $z_{k+1}$ to such a point from some $x_k$, then either $V^{t,R}_k$ or $V^{t+1,L}_k$ is the optimal cost from that $x_k$. Proposition 9.3 excludes many of these constrained candidate costs from eligibility. Corollary 9.4 lists the possible ways that a constrained cost $V^{t,R}_k$ or $V^{t+1,L}_k$ associated with $\hat\gamma^j_{k+1}(t)$ can be eligible.

Corollary 9.4(1) occurs when either Proposition 9.3(iii) or 9.3(iv) holds. Here we hedge to the low-cost side of a $\hat{\hat V}_{k+1}(z_{k+1} \mid r_k = j)$ discontinuity. Corollary 9.4(2) occurs when Proposition 9.3(ii) holds. When (9.142) is true, $V^{t,U}_k$ is not eligible but both $V^{t,L}_k$ and $V^{t,R}_k$ are eligible (unless excluded by Proposition 9.3(iii) or (iv)). Corollary 9.4(3) holds when the slope condition of Proposition 9.3(i) is not satisfied.


Note that if one or more of the conditions of Corollary 9.4 is satisfied for some $z = \hat\gamma^j_{k+1}(t)$, we are not guaranteed that the optimal controller hedges to that z; the associated constrained costs $V^{t,R}_k$ and $V^{t+1,L}_k$ need not be optimal in (9.92).

For finite time horizon problems, if $x_k$ is negative enough or positive enough, the optimal strategy will be to keep x in the same extreme piece of the form transition probabilities $p(j, i; x)$ and x-costs $Q(x, j)$, $Q_T(x, j)$, for all $i \in C_j$, from each $j \in M$, for all future times. The following proposition is a generalization of the JLQ endpiece result (Proposition 6.1) and the JLPQ endpiece result (Proposition 8.6).


Proposition 9.5: (JLPC endpieces)

(1) for $x_k < \delta^j_k(1)$ the optimal control laws and expected costs-to-go are


vw11  =  vi,a  <v3>  -  ^‘v3’ 


VW3)  ■  -  \*<V3) 

(2)  for  x^  >  6^ (m^ ( j ) — i)  the  optimal  control  laws  and 
expected  costs-to-go  are 


..  a  .J*. 


:‘VV3)  ’  \  <V3>  •  V'V3’ 


K*Va,  ..  a  Re 


VW3)  '  \  ‘S’’1  -  \  <V3)  . 


Consequently 
for all $x_k < \delta^j_k(1)$


SNTlV3> 


<3V 


for  all  xk  >  6^(mk(j)  -  1) # 
Proof: (1) and (2) above are immediate generalizations of Propositions 6.1 and 8.6. Item (3) follows directly from Proposition 9.2(v).

The following proposition lists a number of general qualitative properties of the optimal controller for the noisy JLPC problems of this chapter. This proposition is a generalization of Propositions 5.3 and 8.4.


Proposition 9.6: The optimal controller of Proposition 9.1 has the following properties:

(1) At each time k and in each form $j \in M$, between joining points $\{\delta^j_k(t) : t = 1, \ldots, m_k(j) - 1\}$ of $V_k(x_k, r_k = j)$:

$$u_k(x_k, r_k = j) = -\frac{b(j)}{2 a(j) R(j)} \frac{\partial V_k(x_k, r_k = j)}{\partial x_k} \qquad (9.144)$$

$$z_{k+1}(x_k, r_k = j) = a(j) x_k - \frac{b^2(j)}{2 a(j) R(j)} \frac{\partial V_k(x_k, r_k = j)}{\partial x_k} \qquad (9.145)$$

(here $a(j) R(j) \ne 0$, $b(j) \ne 0$).

(2) At those joining points $\delta$ where the slope of $V_k(x_k, r_k = j)$ does not change (i.e., $\partial V_k(x_k, r_k = j)/\partial x_k \big|_{x_k = \delta}$ exists), $u_k(x_k, r_k = j)$ and $z_{k+1}(x_k, r_k = j)$ are continuous functions of $x_k$.

(3) At those joining points $\delta^j_k(t)$ where the slope of $V_k(x_k, r_k = j)$ decreases discontinuously:

    (i) $u_k(x_k, r_k = j)$ increases discontinuously at $\delta$ when $b(j)/a(j) > 0$ (and decreases discontinuously at $\delta$ when $b(j)/a(j) < 0$)

    (ii) the mapping $x_k \mapsto z_{k+1}(x_k, r_k = j)$ increases discontinuously at $\delta$ when $a(j) > 0$ (and decreases discontinuously at $\delta$ when $a(j) < 0$).

(4) The mapping

$$x_k \mapsto z_{k+1}(x_k, r_k = j)$$

has the following properties:

    (i) the mapping is monotonely nondecreasing if $a(j) > 0$ (and monotonely nonincreasing if $a(j) < 0$) for each $j \in M$

    (ii) it consists of $m_k(j)$ line segments:

        - one line segment with positive slope if $a(j) > 0$ (negative slope if $a(j) < 0$) for each $x_k$ region where an "unconstrained cost" $V^{t,U}_k(x_k, r_k = j)$ is optimal, $V_k(x_k, r_k = j) = V^{t,U}_k(x_k, j)$

        - a constant line segment for each $x_k$ region where there is active hedging-to-a-point: $z_{k+1} = \hat\gamma^j_{k+1}(t)$, $t \in \{1, \ldots, \hat\psi_{k+1} - 1\}$

    (iii) there are regions of avoidance associated with (and only with) each $x_k = \delta$ value where the slope of $V_k(x_k, r_k = j)$ decreases discontinuously.

(5) Each candidate linear control law (associated with the costs listed in (9.92)) can be optimal over, at most, a single interval of $x_k$ values.

The proof of this proposition is presented in appendix D.7.
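The relations between the optimal control, the resulting $z_{k+1}$ and the slope of the cost in property (1) can be checked numerically on a smooth piece. The sketch below assumes the relations take the form u = -b/(2aR) dV/dx and z = a x - b²/(2aR) dV/dx (they follow from the first-order condition of the one-stage minimization), and tests them on a hypothetical quadratic cost piece V(x) = (7/11)x² + 7/3 with a = b = R = 1; the helper names are not from the thesis.

```python
def central_diff(f, x, h=1e-5):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def control_from_cost(V, x, a, b, R):
    """Recover (u, z) from the slope of the optimal cost at x."""
    slope = central_diff(V, x)
    u = -b / (2.0 * a * R) * slope
    z = a * x - b**2 / (2.0 * a * R) * slope
    return u, z


V = lambda x: 7.0 / 11.0 * x * x + 7.0 / 3.0  # hypothetical smooth cost piece
u, z = control_from_cost(V, -5.5, a=1.0, b=1.0, R=1.0)
print(u, z)
```

For this piece the recovered control is -(7/11)x and the recovered z is (4/11)x, and z = a x + b u holds as it must.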

We have identified some basic qualitative properties of the JLPC problem that can be used to reduce the combinatorics involved in the "brute-force" solution of the one-stage problem that was presented in the proof of Proposition 9.1. In the next section we will develop a solution algorithm that exploits these properties, enabling us to solve the general JLPC problem (9.1) - (9.16) efficiently.

9.7 An Algorithm for Obtaining the Optimal JLPC Controller


In this section we develop an algorithm for obtaining the optimal controller for general JLPC control problems (9.1) - (9.16). This algorithm is based upon the application of the one-stage solution of Proposition 9.1 recursively, backwards in time, for each form $j \in M$ that the system can take. The basic idea of the noisy JLPC problem solution algorithm here is the same as in the JLQ solution algorithm of section 7.2 and the JLPQ algorithm of section 8.5.

For each form $j \in M$ at time k, we can compute $u_k(x_k, r_k = j)$ and $V_k(x_k, r_k = j)$ one piece at a time, sweeping from left to right along the axis of $a(j) x_k$ values.

The solution algorithm is presented in flowchart form and is described in detail. In principle it can be applied at successive time stages to solve any JLPC control problem of the type in section 9.2. However, since the optimal costs are not piecewise quadratic in $x_k$, the analytical steps specified by this algorithm may often be quite difficult to carry out.

An overview of the solution algorithm is shown in figure 9.5. The algorithm is initialized with the terminal time (k = N) cost parameter (block 2). Then for successively decreasing time through $k = k_0$ (block 13), the one-step solution of Proposition 9.1 is obtained for each form $j \in M$ (block 9).
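The outer loop of this overview (initialize at k = N, then step backwards in time and solve one stage per form) can be sketched as follows. This shows only the control flow: the one-stage solver is passed in as a callable, and the toy solver used in the example is a placeholder, not the one-stage solution of Proposition 9.1. All names are hypothetical.

```python
def backward_sweep(N, k0, forms, terminal_cost, one_stage):
    """Run the backward recursion from time N down to time k0.

    terminal_cost(j) gives the cost object for form j at time N;
    one_stage(k, j, next_costs) returns the cost object for form j at time k,
    given the dictionary of cost objects for all forms at time k + 1.
    """
    costs = {N: {j: terminal_cost(j) for j in forms}}
    for k in range(N - 1, k0 - 1, -1):        # successively decreasing time
        costs[k] = {j: one_stage(k, j, costs[k + 1]) for j in forms}
    return costs


# Toy placeholder: each stage adds one unit to the cheapest successor cost.
costs = backward_sweep(
    N=3,
    k0=0,
    forms=["form 1", "form 2"],
    terminal_cost=lambda j: 0.0,
    one_stage=lambda k, j, nxt: 1.0 + min(nxt.values()),
)
print(costs[0])
```

In the actual algorithm the "cost object" for each form would be the piecewise description of the cost-to-go and control law, and `one_stage` would carry out the left-to-right sweep described in the text.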

In the following discussion we refer to the algorithm flowchart shown in figures 9.6 - 9.14. All of the steps indicated in this flowchart constitute one iteration of block 9 in figure 9.5. That is, they determine the one-stage JLPC solution that is specified by Proposition 9.1 for some time stage k and form j. For the reader's convenience, a table of block number locations and entry points is given in table 9.4.

A  macroscopic  overview  of  the  algorithm  specified  by  this 
flowchart  is  as  follows: 

1. The algorithm is first initialized (in block 1) at time N with the terminal x-cost $Q_T(x_N, j)$ for each $j \in M$.

2. The determination of the optimal controller at time k for a fixed j value constitutes one iteration in block 10. Figure 9.5 differs from figure 8.5 only in blocks 4 and 10.

3. The computations of block 10 begin in block 26 with the determination of the composite $x_{k+1}$ partition (block 14). This partition is obtained exactly as for the JLPQ problems of chapter 8 (figures 8.6 and 9.6 are the same).

4. In figure 9.7 we obtain the tentative $z_{k+1}$ partition and its associated $\hat{\hat V}^j_{k+1}(z_{k+1} \mid r_k = j,\ z_{k+1} \in \hat\Delta(\ell))$, for $\ell = 1, \ldots, \hat\psi_{k+1}$, as described in appendix D.4.

Figure Number   Block Numbers   Entry and Exit Points (as listed)
9.5             1-13            start (block 1); stop (block 12)
9.6             14-26           from block 10; (block 22)
9.7             27-42           (block 27); (block 33)
9.8             43-61           (block 43); (block 61)
9.9             62-84           (block 63); (block 64); (blocks 77, 80)
9.10            85-96           (block 85); (block 96)
9.11            97-101          (block 97); (block 101); (block 101)
9.12            102-108         (block 102); (blocks 103, 104); (block 108)
9.13            109-118         (block 111); (block 118); (block 109); (block 116)

Table 9.4: Block Number Locations, Entry Points and Exit Points for Optimal JLPC Solution Algorithm Flowchart.

7. We next prepare for the rightward sweep along the $a(j) x_k$ axis by obtaining in figure 9.10 the partition of the real line (of $a(j) x_k$ values) that is caused by the points $\{\theta^j_k(t), \Theta^j_k(t-1) : t = 2, \ldots, \hat\psi_{k+1}\}$. Figure 9.10 is the same as figure 8.8 (for the JLPQ problem) except for block 88.

8. Initialization of the rightward sweep is completed in figure 9.11, where the endpiece result of Proposition 9.5 is applied. Figure 9.11 is essentially the same as figure 8.9.

9. Finally the algorithm performs the minimization in (9.93) over each interval of $a(j) x_k$ values in the $\theta$-$\Theta$ partition, starting on the left. This task, shown in figures 9.12 - 9.13, is identical to the steps in figures 8.10 - 8.11 (in the JLPQ problem) and in figures 7.5 - 7.6 (for the JLQ problem), except for blocks 111, 112.

As we mentioned previously, it may be quite difficult to carry out some of the algorithm steps for general JLPC problems. In particular it may be difficult to do the following:

(1) Perform the integrations in blocks 30, 40, 41

(2) Solve for $\partial^2 \hat{\hat V}/\partial z^2 = 0$ in figure 9.8

(3) Determine the unconstrained solutions $V^{t,U}_k$ in blocks 84, 63 and 77

(4) Determine $\theta^j_k(t)$ and $\Theta^j_k(t)$ in blocks 90, 91, 92

(5) Find the intersections specified in blocks 102 and 103

These difficulties arise because of the non-quadratic structure of $V_k(x_k, r_k = j)$. They will be illustrated in the next section when we apply this optimal JLPC algorithm to two time stages of an example problem.


Figure 9.5: Algorithm Overview

Figure 9.8: Algorithm Flowchart-Part IV: Obtaining the complete $z_{k+1}$ Partition, as in Appendix D.4.

Figure 9.9: Algorithm Flowchart-Part V: Using Proposition 9.3

Figure 9.10: Algorithm Flowchart-Part VI: Obtaining the $\theta$-$\Theta$ Grid

Figure 9.11: Algorithm Flowchart-Part VII: End of Initialization

Figure 9.12: Algorithm Flowchart-Part VIII: Comparisons Within a $\theta$-$\Theta$ interval

Figure 9.13: Algorithm Flowchart-Part IX: Moving Rightwards


9.8  Numerical  Solution  of  the  Optimal  Controller 


In the previous section we developed an algorithm flowchart that describes the steps required to obtain the optimal controller for JLPC problems specified by (9.1) - (9.16). However, since the optimal JLPC controller is not piecewise quadratic in $x_k$, in general, we don't have the nice inductive solution structure of chapters 5 and 8. At each time stage the steps specified by the optimal controller algorithm may be difficult or impossible to carry out analytically. Numerical methods are generally required. These difficulties motivate the development of suboptimal approximations to the optimal JLPC controller that are easier to obtain and implement.

In this section we will illustrate the optimal JLPC algorithm of figures 9.5 - 9.13 by applying it to the last two time stages of the noisy JLPC control problem that was begun in example 9.2 (section 9.3). This example yields an optimal controller that has optimal control laws that are not piecewise-linear in $x_k$. We will use this example to demonstrate some of the qualitative properties of optimal JLPC controllers that were established in section 9.6.

The determination of the optimal controller at time k = N-2 requires the solution of equations that are difficult or impossible to obtain analytically. Numerical methods for obtaining the optimal controller will be described and illustrated for this example.

The difficulties encountered in solving this example at time k = N-2 motivate the development of a suboptimal approximation to the one-stage JLPC controller solution that is analytically tractable in section 9.9.


We begin by considering the application of the optimal solution algorithm to example 9.2 at time k = N-1:

Example 9.4: Example 9.2, continued at k = N-1

In section 9.3 we derived the tentative $z_N$ partition and the $\hat{\hat V}_N(z_N \mid r_{N-1} = 1, z_N \in \hat\Delta_N(t))$ pieces in (9.47) - (9.48) by following the steps described in figures 9.5 - 9.7:

$$\hat{\hat V}_N(z_N \mid r_{N-1} = 1) = \begin{cases} \frac{7}{4} z_N^2 + \frac{7}{3} & \text{if } z_N < -1 \\[4pt] -\frac{1}{8} z_N^3 + \frac{5}{2} z_N^2 - \frac{3}{2} z_N + \frac{83}{24} & \text{if } -1 < z_N < 3 \\[4pt] \frac{13}{4} z_N^2 + \frac{13}{3} & \text{if } 3 < z_N \end{cases} \qquad (9.146)$$

Following the steps of figure 9.8, we find that inside $\Delta_N(2) = (-1,3)$
we have

$$
\frac{\partial^2 \hat V_N(z_N \mid r_{N-1}=1;\,2)}{\partial z_N^2} = -\tfrac{3}{4} z_N + 5 > 0 .
$$

Consequently no additional grid points are needed; we have

$\gamma_N(1) = -1$

$\gamma_N(2) = 3$


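In block 63 we obtain (from (9.126) - (9.128))

$u^{1,U}_{N-1}(x_{N-1},1) = -\tfrac{7}{11}\, x_{N-1}$

$z^{1,U}_{N}(x_{N-1},1) = \tfrac{4}{11}\, x_{N-1}$

$V^{1,U}_{N-1}(x_{N-1},1) = \tfrac{7}{11}\, x_{N-1}^2 + \tfrac{7}{3}$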


In block 67, at $z = \gamma_N(1) = -1$ we have

$\hat V_N(z;1) = 4.0833 < 7.5833 = \hat V_N(z;2)$

so

$V^{1,R}_{N-1}(x_{N-1},1) = x_{N-1}^2 + 2 x_{N-1} + 5.0833$


is an eligible candidate cost. As we have shown above, $\hat V_N(z_N;2)$ has a
positive second derivative, hence the answer to the question in block
81 is "yes". Therefore we compute $V^{2,U}_{N-1}(x_{N-1},1)$ in block 84:

$u^{2,U}_{N-1}(x_{N-1},1) = -x_{N-1} + 9.3333 - \sqrt{83.1111 - 5.3333\, x_{N-1}}$

$z^{2,U}_{N}(x_{N-1},1) = 9.3333 - \sqrt{83.1111 - 5.3333\, x_{N-1}}$

$V^{2,U}_{N-1}(x_{N-1},1) = x_{N-1}^2 - 18.6667\, x_{N-1} + 192.718 + [-20.7778 + 1.3333\, x_{N-1}]\,\sqrt{83.1111 - 5.3333\, x_{N-1}}$


Returning to block 67, at $z = \gamma_N(2) = 3$ we have

$\hat V_N(z;2) = 18.0833 < 33.5833 = \hat V_N(z;3)$

hence from (9.124),

$V^{2,R}_{N-1}(x_{N-1},1) = x_{N-1}^2 - 6 x_{N-1} + 27.0833$


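is an eligible candidate cost. Proceeding through blocks 68 → 71 → 72
→ 76 → 77 we compute (from (9.126) - (9.128)):

$V^{3,U}_{N-1}(x_{N-1},1) = \tfrac{13}{17}\, x_{N-1}^2 + \tfrac{13}{3}$

$u^{3,U}_{N-1}(x_{N-1},1) = -\tfrac{13}{17}\, x_{N-1}$

$z^{3,U}_{N}(x_{N-1},1) = \tfrac{4}{17}\, x_{N-1}$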

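The eligible¹ candidate costs-to-go for $V_{N-1}(x_{N-1}, r_{N-1}=1)$ are
$V^{1,U}_{N-1}$, $V^{1,R}_{N-1}$, $V^{2,U}_{N-1}$, $V^{2,R}_{N-1}$ and $V^{3,U}_{N-1}$. Following the steps
in figure 9.10 we obtain the $\theta$-$\Theta$ grid:

$\Theta_{N-1}(1) = -2.75$        $\Theta_{N-1}(2) = 8.0625$

$\theta_{N-1}(2) = -2.4375$      $\theta_{N-1}(3) = 12.75$

These values are computed using (9.129) - (9.130). The ordering
specified by block 96 is

$\Theta_{N-1}(1) < \theta_{N-1}(2) < \Theta_{N-1}(2) < \theta_{N-1}(3)$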


Figures 9.11 - 9.13 are then followed to obtain $V_{N-1}(x_{N-1}, r_{N-1}=1)$.

¹ according to Proposition 9.3

Since these steps are almost identical to those in figures 8.9 - 8.11,
we will not describe them in detail here. There is one step that is
difficult to carry out in this example. That is the determination of
the intersection of $V^{1,R}_{N-1}(x_{N-1},1)$ with $V^{2,U}_{N-1}(x_{N-1},1)$ in block 102.
We obtain the value $x_{N-1} = -.813$ numerically.
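The numerical step mentioned above can be sketched as a simple bisection on the difference of the two candidate costs. This is only an illustrative sketch: the function names `v1r`, `v2u` and `bisect` are hypothetical, the coefficients are the rounded values printed in this example, and the bracket is assumed to be the interval on which the unconstrained cost $V^{2,U}_{N-1}$ was derived.

```python
import math

def v1r(x):
    # Constrained candidate: hedge to z_N = -1 (rounded coefficients from the text).
    return x**2 + 2.0 * x + 5.0833

def v2u(x):
    # Unconstrained candidate on the cubic piece of the z-conditional cost.
    s = math.sqrt(83.1111 - 5.3333 * x)
    return x**2 - 18.6667 * x + 192.718 + (1.3333 * x - 20.7778) * s

def bisect(f, lo, hi, tol=1e-9):
    # Plain bisection; assumes f changes sign exactly once on [lo, hi].
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Bracket assumed from the example: -2.75 < x_{N-1} < 8.0625.
x_cross = bisect(lambda x: v1r(x) - v2u(x), -2.75, 8.0625)
print(round(x_cross, 3))
```

Because the printed coefficients are rounded, the crossing point found this way should agree with the value quoted in the text only to about three decimal places.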

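The optimal expected cost-to-go has 5 pieces:

$$
V_{N-1}(x_{N-1}, r_{N-1}=1) =
\begin{cases}
V^{1,U}_{N-1} = .636364\, x_{N-1}^2 + 2.3333 & \text{if } x_{N-1} < -2.75 \\[4pt]
V^{1,R}_{N-1} = x_{N-1}^2 + 2 x_{N-1} + 5.08333 & \text{if } -2.75 < x_{N-1} < -.813 \\[4pt]
V^{2,U}_{N-1} = x_{N-1}^2 - 18.6667\, x_{N-1} + 192.718 + [1.3333\, x_{N-1} - 20.7778]\sqrt{83.1111 - 5.3333\, x_{N-1}} & \text{if } -.813 < x_{N-1} < 8.0625 \\[4pt]
V^{2,R}_{N-1} = x_{N-1}^2 - 6 x_{N-1} + 27.083 & \text{if } 8.0625 < x_{N-1} < 20.866 \\[4pt]
V^{3,U}_{N-1} = .76471\, x_{N-1}^2 + 4.3333 & \text{if } x_{N-1} > 20.866
\end{cases}
\qquad (9.147)
$$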


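This optimal cost has a piece that is not quadratic in $x_{N-1}$. The
corresponding optimal control law and $x_{N-1} \to z_N$ mappings are
as follows:

$$
u_{N-1}(x_{N-1}, r_{N-1}=1) =
\begin{cases}
u^{1,U}_{N-1} = -.636364\, x_{N-1} & \text{if } x_{N-1} < -2.75 \\[4pt]
u^{1,R}_{N-1} = -x_{N-1} - 1 & \text{if } -2.75 < x_{N-1} < -.813 \\[4pt]
u^{2,U}_{N-1} = -x_{N-1} + 9.3333 - \sqrt{83.1111 - 5.3333\, x_{N-1}} & \text{if } -.813 < x_{N-1} < 8.0625 \\[4pt]
u^{2,R}_{N-1} = -x_{N-1} + 3 & \text{if } 8.0625 < x_{N-1} < 20.866 \\[4pt]
u^{3,U}_{N-1} = -.76471\, x_{N-1} & \text{if } x_{N-1} > 20.866
\end{cases}
\qquad (9.148)
$$

$$
z_N(x_{N-1}, r_{N-1}=1) =
\begin{cases}
z^{1,U}_{N} = .363636\, x_{N-1} & \text{if } x_{N-1} < -2.75 \\[4pt]
z^{1,R}_{N} = -1 & \text{if } -2.75 < x_{N-1} < -.813 \\[4pt]
z^{2,U}_{N} = 9.3333 - \sqrt{83.1111 - 5.3333\, x_{N-1}} & \text{if } -.813 < x_{N-1} < 8.0625 \\[4pt]
z^{2,R}_{N} = 3 & \text{if } 8.0625 < x_{N-1} < 20.866 \\[4pt]
z^{3,U}_{N} = .23529\, x_{N-1} & \text{if } x_{N-1} > 20.866
\end{cases}
\qquad (9.149)
$$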


These optimal quantities are shown in figures 9.14 - 9.16. Note that
the slope of $V_{N-1}(x_{N-1}, r_{N-1}=1)$ is discontinuous at $x_{N-1} = -.813,\ 20.866$.
Associated with these discontinuities are the regions of $z_N$ avoidance

(-1, -.0179827) and (3, 4.90951)

as specified by Proposition 9.6. Note that the mapping
$z_N(x_{N-1}, r_{N-1}=1)$ shown in figure 9.16 is monotonically nondecreasing,
as claimed in Proposition 9.6. □

[Figure: $z_N(x_{N-1}, r_{N-1}=1)$ in example 9.2 (to scale).]
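The qualitative properties noted above (a nondecreasing $x_{N-1} \to z_N$ mapping with two avoided $z_N$ intervals) can be checked numerically from the printed mapping (9.149). The sketch below is illustrative only: the function names are hypothetical, the coefficients are the rounded values printed in the example, and the relation $z_N = x_{N-1} + u_{N-1}$ is an assumption inferred from the control laws and mappings of this example.

```python
import math

def z_next(x):
    # x_{N-1} -> z_N mapping for form r_{N-1} = 1, using the printed rounded values.
    if x < -2.75:
        return 0.363636 * x
    if x < -0.813:
        return -1.0
    if x < 8.0625:
        return 9.3333 - math.sqrt(83.1111 - 5.3333 * x)
    if x < 20.866:
        return 3.0
    return 0.23529 * x

def control(x):
    # Assumes z_N = x_{N-1} + u_{N-1}, as the numbers in this example suggest.
    return z_next(x) - x

xs = [i / 100.0 for i in range(-3000, 3001)]
zs = [z_next(x) for x in xs]

# Tolerance absorbs the rounding of the printed coefficients at piece boundaries.
monotone = all(b >= a - 1e-4 for a, b in zip(zs, zs[1:]))

# Values of z_N just below and at each joining point delimit the avoided intervals.
gap1 = (z_next(-0.813 - 1e-9), z_next(-0.813))
gap2 = (z_next(20.866 - 1e-9), z_next(20.866))
print(monotone, gap1, gap2)
```

The two pairs printed at the end bracket the $z_N$ values that the controller never selects, which is the hedging behavior described in the text.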

Example 9.5: Example 9.2 at k = N-2

Obtaining $V_{N-1}(x_{N-1}, r_{N-1}=1)$ for example 9.2 via the algorithm of
section 9.7 presents no significant difficulties. At time k = N-2,
however, things are much different. Many of the algorithm steps must
be done numerically. In particular, it is difficult or impossible to
analytically obtain some of the unconstrained candidate costs $V^{t,U}_{N-2}$
(and their associated control laws), and it is difficult to analytically
find the intersections of these $V^{t,U}_{N-2}$ with other candidate costs.
In this example we demonstrate how the algorithm steps can be followed
without analytically determining the $V^{t,U}_{N-2}$ functions.

To obtain $V_{N-2}(x_{N-2}, r_{N-2}=1)$ we first follow the steps in
figure 9.6, to obtain the $x_{N-1}$-conditional cost $\hat V_{N-1}(x_{N-1} \mid r_{N-2}=1)$
as follows:
Here the number of pieces is 6 and the $x_{N-1}$ grid points are

-2.75,  -.813,  1,  8.065  and  20.866.

The computation of $\hat V_{N-1}(x_{N-1} \mid r_{N-2}=1)$ (in block 27) can be
done analytically for any JLPC problem at each time stage, since it
involves only multiplication and addition of known functions of $x_{N-1}$.


Next we determine $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ for this example. Following
the steps in figure 9.7 with

a(l)  *  -2  a  (2)  -  2  (a  -  3  ) 

we obtain the tentative $z_{N-1}$ grid points (in block 28):

$\Delta(1,1) = \delta(2,1) = -.75 = \gamma(4)$        $\Delta(2,1) = \delta(3,1) = -4.75 = \gamma(1)$
$\Delta(1,2) = \delta(2,2) = 1.187 = \gamma(5)$       $\Delta(2,2) = \delta(3,2) = -2.813 = \gamma(2)$
$\Delta(1,3) = \delta(2,3) = 3 = \gamma(6)$           $\Delta(2,3) = \delta(3,3) = -1 = \gamma(3)$
$\Delta(1,4) = \delta(2,4) = 10.065 = \gamma(8)$      $\Delta(2,4) = \delta(3,4) = 6.065 = \gamma(7)$
$\Delta(1,5) = \delta(2,5) = 22.866 = \gamma(10)$     $\Delta(2,5) = \delta(3,5) = 18.866 = \gamma(9)$

with $\psi = 11$.

A 

The $z_{N-1}$-conditional cost $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ is then determined via
blocks 29-33 of figure 9.7. Its $\psi = 11$ pieces are as follows:

1. if $z_{N-1} < -4.75$,

$\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1) = (1.2773)\, z_{N-1}^2 + 3.45307$


2. if $-4.75 < z_{N-1} < -2.813$,

$\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1) = (.022725)\, z_{N-1}^3 + (1.60115)\, z_{N-1}^2 + (1.53833)\, z_{N-1} + 5.88873$
N— 1 


3.  if  -2.813  <  zN-1  <  -1  , 

A 

WVi'w11  •  ('022725)Vi  -  U3“5lVi  +  (28-9702)2,.-i 


+  (153.019) 


+  [-(.01875)2^  +  .254693]  // 72.444 


-5.333z. 


if  -1  <  ZN-1  <  “*75  ' 


Vl(VliV2‘1)  *  (.114391)2^  +  (1.38032)2^  +  (10.6467)2^ 
-(42.9246) 


+  [-(. 00625) zN_L  +  .084898]  //72.444  \3 

VI -5.333z.  ,  / 


if  -.75  <  2n_1  <  1.187  , 

A 

Vl(ZN-llrN-2*1)  "  (’091666)Vl  +  (1-32917)ZN-1  +  <10-6084)zn_i 


-  (47.9342) 


+  [-(. 00625)  2n_]_  +  .084898]  //72.444  \3 

V \ -5.3332  ,  / 
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if  1.187  <  z  .  <  3  , 

N— ± 

A 

V  , (Z„  .  |r„  =1)  =  ( . 091666) z3  +  (3 . 26667) z3  -  (32.3235)z 

N-l  N-l  1  N-l  N-l  N-l 

+  (251.697) 

72.444  \  3 

-5-333W 

+  [(.01875) *  L  -  .329693] /(93. 7778  -  5.333*^)  J 

if  3  <  z%,  ,  <  6.065  , 

N— 1 

A 

Vl'VlIV^11  =  (2 .65) z^_i  -  (4.66667)zN-1  +  51.7128 

+  [-(.00625)2^  +  .0848981  /(72. 4444  -  5.33333zN_x) 3 

+  [+(.00625)  zM  -  .109898]/ (93. 7778  -  5.33333z„  J3 

N— 1 

if  6.065  <  z„  .  <  10.065  , 

N-l 

A 

Vl(VllrN-2-1)  -  <3-°««3>2h-1  '  <l3-"355)zN.l  +  102.159 

+  H.006251Z  -  .109898)  /(93.7778  -  5. 33333*^) J 


+  [-(.00625)z. 


N-l 


.084898] 


9. if $10.065 < z_{N-1} < 18.866$,

$\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1) = (2.65)\, z_{N-1}^2 - (1.5)\, z_{N-1} + (10.3041)$


10. 


if  18.866  <  z, 


<  22.866 


N-l 

/v 

Vi'VilW11  -  +  <2-8075>Vi 

-  (2.23188) z  .  +  1.6278 
N— X 


11. if $22.866 < z_{N-1}$,

$\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1) = (2.59)\, z_{N-1}^2 + 4.53663$


Obtaining the $z_{N-1}$ partition for this example, and obtaining the
$z_{k+1}$ partition for arbitrary JLPC problems at time k (as in block 28),
does not present any special difficulties. For some problems, however,
it may be difficult to determine the z-cost $\hat V_{k+1}(z_{k+1} \mid r_k=j)$ as a
function of $z_{k+1}$. In example 9.2, finding $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ via
figure 9.7 is straightforward since the integrals in blocks 40 and
41 can be done analytically. For arbitrary $\hat V_{k+1}(x_{k+1} \mid r_k=j)$, if
these integrations cannot be done analytically then numerical methods
of integration must be used.

Comparing $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ with $\hat V_N(z_N \mid r_{N-1}=1)$ in (9.146),
we see that the z-conditional cost is much more complicated at
stage N-2. Following the steps of figure 9.8, we find¹ that no
additional $z_{N-1}$ grid points are needed. Thus the complete $z_{N-1}$
partition is

$\gamma_{N-1}(1) = -4.75$        $\gamma_{N-1}(6) = 3$
$\gamma_{N-1}(2) = -2.813$       $\gamma_{N-1}(7) = 6.065$
$\gamma_{N-1}(3) = -1$           $\gamma_{N-1}(8) = 10.065$
$\gamma_{N-1}(4) = -.75$         $\gamma_{N-1}(9) = 18.866$
$\gamma_{N-1}(5) = 1.187$        $\gamma_{N-1}(10) = 22.866$

with $\psi_{N-1}(1) = 11$. The second derivative of $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ is
everywhere positive in this example. That is

$$
\frac{\partial^2 \hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)}{\partial z_{N-1}^2} > 0
$$

inside each interval of the $z_{N-1}$ partition.

¹ This can be done by numerical methods (substitution of values and
testing) or analytically in this example. For some examples only
numerical methods are feasible.

Next we follow figure 9.9 to determine which candidate costs
are eligible in terms of Proposition 9.3. The tests in figure 9.9
can be done without actually computing any of the candidate costs.
Only the values of

$\hat V_{k+1}(z_{k+1};t)$

and

$\dfrac{\partial \hat V_{k+1}(z_{k+1};t)}{\partial z_{k+1}}$

at the grid points $\{\gamma_{k+1}(t)\}$ are needed in
blocks 67, 70 and 74.


For this example, the values of $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ and its
derivative at the $z_{N-1}$ grid points are listed in table 9.5. Note
that $\hat V_{N-1}(z_{N-1} \mid r_{N-2}=1)$ is continuous in $z_{N-1}$.

Using these values in figure 9.9, the list of eligible
candidate costs for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ is found to be

$V^{t,U}_{N-2}(x_{N-2}, r_{N-2}=1)$ for t = 1,...,11

and

$V^{6,R}_{N-2}(x_{N-2}, r_{N-2}=1) = V^{7,L}_{N-2}(x_{N-2}, r_{N-2}=1)$


The next task is to obtain the $\theta$-$\Theta$ grid, as in figure 9.10.
For this example

$$
\frac{\partial^2 \hat V_{N-1}(z_{N-1};t)}{\partial z_{N-1}^2} + \frac{2R(j)}{b^2(j)} > 0
$$

for each t = 1,...,11. We compute $\{\theta^1_{N-2}(t),\ \Theta^1_{N-2}(t-1) : t = 1,\ldots,11\}$
directly in block 92 from (9.129) - (9.130), using the values
in table 9.5:


z_{N-1}      V^_{N-1}(z_{N-1} | r_{N-2}=1)    dV^_{N-1}(z_{N-1} | r_{N-2}=1)/dz_{N-1}

-4.75^-      32.272        -12.134
-4.75^+      32.272        -12.134
-2.813^-     13.725        -6.930
-2.813^+     13.725        -6.930
-1^-         5.216         -2.4425      (discontinuity)
-1^+         5.216         -2.4886
-.75^-       4.695         -1.6741
-.75^+       4.695         -1.6741
1.187^-      8.33          5.7516
1.187^+      8.33          5.7516
3^-          27.09         15.2791      (discontinuity)
3^+          27.09         15.3251
6.065^-      98.16         31.034
6.065^+      98.16         31.034
10.065^-     263.66        51.845
10.065^+     263.66        51.845
18.866^-     905.62        98.4898      (discontinuity)
18.866^+     905.62        98.3524
22.866^-     1358.73       118.4459
22.866^+     1358.73       118.4459

Table 9.5: Values of the $z_{N-1}$-conditional cost and its
derivative to the left and right of each grid
point $\gamma_{N-1}(t)$.
$\Theta^1_{N-2}(1) = -10.817 = \theta^1_{N-2}(2)$
$\Theta^1_{N-2}(2) = -6.278 = \theta^1_{N-2}(3)$
$\Theta^1_{N-2}(3) = -2.221$,      $\theta^1_{N-2}(4) = -2.244$
$\Theta^1_{N-2}(4) = -1.587 = \theta^1_{N-2}(5)$
$\Theta^1_{N-2}(5) = 4.0628 = \theta^1_{N-2}(6)$
$\Theta^1_{N-2}(6) = 10.6395$,     $\theta^1_{N-2}(7) = 10.6625$          (9.150)
$\Theta^1_{N-2}(7) = 21.582 = \theta^1_{N-2}(8)$
$\Theta^1_{N-2}(8) = 35.987 = \theta^1_{N-2}(9)$
$\Theta^1_{N-2}(9) = 68.111$,      $\theta^1_{N-2}(10) = 68.042$
$\Theta^1_{N-2}(10) = 82.089 = \theta^1_{N-2}(11)$
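As a rough check on the grid above, each boundary can be recomputed from the table 9.5 entries alone. The sketch below assumes the stationarity condition $2R u + b\,\partial \hat V/\partial z = 0$ together with $z = a x + b u$, and takes $a = b = R = 1$ for this example; those parameter values and the names `TABLE_9_5` and `x_boundary` are assumptions made for illustration, inferred from the numbers in the text rather than stated in it.

```python
# (z grid point, left derivative, right derivative), copied from table 9.5.
TABLE_9_5 = [
    (-4.75, -12.134, -12.134), (-2.813, -6.930, -6.930),
    (-1.0, -2.4425, -2.4886), (-0.75, -1.6741, -1.6741),
    (1.187, 5.7516, 5.7516), (3.0, 15.2791, 15.3251),
    (6.065, 31.034, 31.034), (10.065, 51.845, 51.845),
    (18.866, 98.4898, 98.3524), (22.866, 118.4459, 118.4459),
]

def x_boundary(z, dv_dz, a=1.0, b=1.0, r=1.0):
    # Unconstrained stationarity: 2 r u + b dV/dz = 0, with z = a x + b u.
    u = -b * dv_dz / (2.0 * r)
    return (z - b * u) / a

# For each grid point: (x where the piece to the left ends,
#                       x where the piece to the right begins).
grid = [(x_boundary(z, dl), x_boundary(z, dr)) for z, dl, dr in TABLE_9_5]

for (z, _, _), (x_left, x_right) in zip(TABLE_9_5, grid):
    print(f"z = {z:8.3f}   left piece ends at {x_left:9.4f}   right piece starts at {x_right:9.4f}")
```

Where the derivative is continuous the two x values coincide; at the three grid points with a derivative discontinuity they differ slightly, which is what produces the short overlapping intervals discussed in the example.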

Now we follow the steps outlined in figures 9.11 - 9.13 to
determine $V_{N-2}(x_{N-2}, r_{N-2}=1)$ for each $x_{N-2}$ value. We can find
all of the boundaries of unconstrained cost domains of validity and
most of the intersections between candidate costs without explicitly
analytically determining all of the $V^{t,U}_{N-2}(x_{N-2},1)$ functions. To do this
we need to find the constrained cost functions $V^{t,L}_{N-2}$ and $V^{t,R}_{N-2}$.
From (9.117), (9.120) of Proposition 9.2 and table 9.5 we have

$V^{1,R}_{N-2} = V^{2,L}_{N-2} = x_{N-2}^2 + 9.5\, x_{N-2} + 54.8346$
$V^{2,R}_{N-2} = V^{3,L}_{N-2} = x_{N-2}^2 + 5.626\, x_{N-2} + 21.638$
$V^{3,R}_{N-2} = V^{4,L}_{N-2} = x_{N-2}^2 + 2\, x_{N-2} + 6.21595$
$V^{4,R}_{N-2} = V^{5,L}_{N-2} = x_{N-2}^2 + 1.5\, x_{N-2} + 6.2573$
$V^{5,R}_{N-2} = V^{6,L}_{N-2} = x_{N-2}^2 - 2.374\, x_{N-2} + 9.7420$          (9.151)
$V^{6,R}_{N-2} = V^{7,L}_{N-2} = x_{N-2}^2 - 6\, x_{N-2} + 36.09$
$V^{7,R}_{N-2} = V^{8,L}_{N-2} = x_{N-2}^2 - 12.13\, x_{N-2} + 134.88$
$V^{8,R}_{N-2} = V^{9,L}_{N-2} = x_{N-2}^2 - 20.13\, x_{N-2} + 364.9$
$V^{9,R}_{N-2} = V^{10,L}_{N-2} = x_{N-2}^2 - 37.732\, x_{N-2} + 1254.01$
$V^{10,R}_{N-2} = V^{11,L}_{N-2} = x_{N-2}^2 - 45.732\, x_{N-2} + 1881.58$

Using these easily obtained constrained cost functions we can
avoid having to analytically determine the unconstrained cost
functions¹, as we will demonstrate below in detail.
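To make concrete why the constrained costs are "easily obtained", the following sketch expands the cost of hedging to a fixed grid point into its quadratic coefficients. It assumes the one-stage cost has the form $R u^2 + \hat V(z)$ with $z = a x + b u$ and $a = b = R = 1$ in this example; those values and the helper name `constrained_cost_coeffs` are illustrative assumptions, not taken from the propositions cited above.

```python
def constrained_cost_coeffs(z_target, v_hat, a=1.0, b=1.0, r=1.0):
    """Return (K, H, G) such that hedging to z_target costs K x^2 + H x + G.

    With u = (z_target - a x) / b the control cost is r u^2, and v_hat is the
    value of the z-conditional cost at the target point.
    """
    k = r * a * a / (b * b)
    h = -2.0 * a * r * z_target / (b * b)
    g = r * z_target ** 2 / (b * b) + v_hat
    return k, h, g

# A few (grid point, z-conditional cost) pairs taken from table 9.5.
for z, v in [(-4.75, 32.272), (-1.0, 5.216), (3.0, 27.09)]:
    k, h, g = constrained_cost_coeffs(z, v)
    print(f"z = {z:6.2f}:  {k:.0f} x^2 + ({h:.3f}) x + {g:.4f}")
```

Only the value of the z-conditional cost at the grid point is needed, which is why no differentiation or root finding is involved for these candidates.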


Now we consider in turn each $x_{N-2}$ interval in the $\theta$-$\Theta$
partition specified by (9.150).

For $x_{N-2}$ sufficiently negative, we know that

$V_{N-2}(x_{N-2}, r_{N-2}=1) = V^{1,U}_{N-2}(x_{N-2},1)$

Since $\hat V_{N-1}(z_{N-1};1)$ is quadratic in $z_{N-1}$, $V^{1,U}_{N-2}(x_{N-2},1)$ is quadratic
in $x_{N-2}$.

¹ These details are expounded in the next five pages.
In the interval $(-\infty,\ -10.817 = \Theta^1_{N-2}(1))$, the only two valid
eligible candidate cost functions are $V^{1,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$.
From (9.151) and Proposition 9.2 we have

$V^{6,R}_{N-2} = V^{7,L}_{N-2} = 217.995$ at $x_{N-2} = -10.817$

$V^{1,U}_{N-2} = V^{1,R}_{N-2} = 69.081$ at $x_{N-2} = -10.817$.

Consequently (from Proposition 9.6(5)),

$V_{N-2}(x_{N-2}, r_{N-2}=1) = V^{1,U}_{N-2}(x_{N-2},1)$ for $x_{N-2} < -10.817$.
In $-10.817 < x_{N-2} < -6.278$, $V^{1,U}_{N-2}$ ceases to be valid. The
valid eligible costs are $V^{2,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$. Using (9.151)
and (9.132), (9.134) of Proposition 9.2:

at $x_{N-2} = \theta^1_{N-2}(2) = -10.817$,   $V^{2,U}_{N-2} = V^{2,L}_{N-2} = 69.081$

at $x_{N-2} = \Theta^1_{N-2}(2) = -6.278$,   $V^{2,U}_{N-2} = V^{2,R}_{N-2} = 25.731256$

and   $V^{6,R}_{N-2} = V^{7,L}_{N-2} = 113.17128$.

Consequently $V_{N-2}(x_{N-2}, r_{N-2}=1) = V^{2,U}_{N-2}(x_{N-2},1)$ over (-10.817, -6.278).


In the interval $-6.278 < x_{N-2} < -2.244$, $V^{2,U}_{N-2}$ ceases to be valid.
The valid eligible costs are $V^{3,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$. At

$x_{N-2} = \theta^1_{N-2}(3) = -6.278$,   $V^{3,U}_{N-2} = 25.732$,

so $V^{3,U}_{N-2}$ is optimal there. But what is the numerical value of
$V^{3,U}_{N-2}$ at $x_{N-2} = -2.244 = \theta^1_{N-2}(4)$? From Proposition 9.2 we know that

$$
u^{3,U}_{N-2}(x_{N-2},1) = -\frac{1}{2}\,\frac{\partial \hat V_{N-1}(z_{N-1};3)}{\partial z_{N-1}}
\quad \text{evaluated at } z_{N-1} = z^{3,U}_{N-1} = x_{N-2} + u^{3,U}_{N-2}
\qquad (9.152)
$$

$$
x_{N-2} = z^{3,U}_{N-1} - u^{3,U}_{N-2}
\qquad (9.153)
$$

with resulting cost

$$
V^{3,U}_{N-2}(x_{N-2},1) = \left(u^{3,U}_{N-2}\right)^2 + \hat V_{N-1}(z^{3,U}_{N-1};3)
\qquad (9.154)
$$

where

$z_{N-1} \in (\gamma_{N-1}(2),\ \gamma_{N-1}(3)) = (-2.813,\ -1)$.

Using (9.152) - (9.154) we need to find $V^{3,U}_{N-2}(x_{N-2},1)$ for $x_{N-2}$ near
-2.244 and compare it with the value of $V^{6,R}_{N-2} = V^{7,L}_{N-2}$ at that $x_{N-2}$,
in order to determine if $V^{6,R}_{N-2} = V^{7,L}_{N-2}$ is optimal for any $x_{N-2}$
in (-6.278, -2.244). We find that to obtain $z^{3,U}_{N-1} = -1$, the
control is

$u^{3,U}_{N-2} = +1.2213$

applied to

$x_{N-2} = -2.22$

with resulting cost

$V^{3,U}_{N-2} = 6.7074$.

Since

$V^{6,R}_{N-2} = V^{7,L}_{N-2} = 54.34$

at $x_{N-2} = -2.22$, it is clear that $V^{3,U}_{N-2}$ is optimal over the entire
interval (-6.278, -2.244).


Next we consider the interval $\theta^1_{N-2}(4) = -2.244 < x_{N-2} < -2.221 = \Theta^1_{N-2}(3)$.
Here $V^{3,U}_{N-2}$ is the prevailing optimal cost at $x_{N-2} = -2.244$
and the costs $V^{4,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$ are valid, eligible candidates
for optimality. At $x_{N-2} = -2.221 = \Theta^1_{N-2}(3)$,

$V^{3,U}_{N-2} = V^{3,R}_{N-2} = 6.7067$

and since $V^{4,U}_{N-2} < V^{4,L}_{N-2}$ for all $x_{N-2} > \theta^1_{N-2}(4)$, we know that $V^{3,U}_{N-2}$
will cross $V^{4,U}_{N-2}$ somewhere inside (-2.244, -2.221). We will
call this point of intersection $\delta^1_{N-2}(3)$. We defer its
determination until later in this discussion. Since

$V^{6,R}_{N-2} = V^{7,L}_{N-2} = 54.36$ at $x_{N-2} = -2.221$

we have that

$$
V_{N-2}(x_{N-2},1) =
\begin{cases}
V^{3,U}_{N-2} & \text{for } -2.244 < x_{N-2} < \delta^1_{N-2}(3) \\[4pt]
V^{4,U}_{N-2} & \text{for } \delta^1_{N-2}(3) < x_{N-2} < -2.221
\end{cases}
$$

where $-2.244 < \delta^1_{N-2}(3) < -2.221$.

In the interval (-2.221, -1.587) the only eligible valid
candidate costs are $V^{4,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$. Since at $x_{N-2} = -1.587$
we have

$V^{6,R}_{N-2} = V^{7,L}_{N-2} = 48.1306$

$V^{4,U}_{N-2} = V^{4,R}_{N-2} = 6.3954$,

we see that $V^{4,U}_{N-2}$ is optimal over the entire interval (-2.221, -1.587).

In (-1.587, 4.0628) the eligible valid candidates are $V^{5,U}_{N-2}$
and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$. At $x_{N-2} = -1.587 = \theta^1_{N-2}(5) = \Theta^1_{N-2}(4)$,

$V^{5,U}_{N-2} = V^{5,L}_{N-2} = 6.395$

and at $x_{N-2} = 4.0628 = \Theta^1_{N-2}(5) = \theta^1_{N-2}(6)$,

$V^{5,U}_{N-2} = V^{5,R}_{N-2} = 16.6033$

$V^{6,R}_{N-2} = V^{7,L}_{N-2} = 28.2195$

Therefore $V^{5,U}_{N-2}$ is optimal over (-1.587, 4.0628).

Over the next interval (4.0628, 10.639), the eligible valid
candidates are $V^{6,U}_{N-2}$ and $V^{6,R}_{N-2} = V^{7,L}_{N-2}$. Since $V^{6,U}_{N-2} < V^{6,R}_{N-2}$ except
at $\Theta^1_{N-2}(6)$,

$V_{N-2}(x_{N-2}, r_{N-2}=1) = V^{6,U}_{N-2}$ for $4.0628 \le x_{N-2} \le 10.639$.

Over (10.639, 10.662) the only eligible valid candidate is
$V^{6,R}_{N-2} = V^{7,L}_{N-2}$, so it is optimal. Over (10.662, 21.582) the only
eligible valid candidate is $V^{7,U}_{N-2}$, so it is optimal. Similarly,
$V^{8,U}_{N-2}$ is optimal over (21.582, 35.987) and $V^{9,U}_{N-2}$ is optimal over
(35.987, 68.042).

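Now in the interval $(68.042 = \theta^1_{N-2}(10),\ \Theta^1_{N-2}(9) = 68.111)$ the
cost functions $V^{9,U}_{N-2}$ and $V^{10,U}_{N-2}$ are both valid eligible candidates.
At $x_{N-2} = 68.042$, $V^{9,U}_{N-2}$ is optimal. At $x_{N-2} = 68.111 = \Theta^1_{N-2}(9)$,

$V^{9,U}_{N-2} = V^{9,R}_{N-2} = V^{10,L}_{N-2} > V^{10,U}_{N-2}$,

so $V^{10,U}_{N-2}$ is optimal here. In (68.042, 68.111) we have the
intersection of $V^{9,U}_{N-2}$ and $V^{10,U}_{N-2}$, which we will denote by $\delta^1_{N-2}(10)$:

$$
V_{N-2}(x_{N-2},1) =
\begin{cases}
V^{9,U}_{N-2} & \text{for } 68.042 < x_{N-2} < \delta^1_{N-2}(10) \\[4pt]
V^{10,U}_{N-2} & \text{for } \delta^1_{N-2}(10) < x_{N-2} < 68.111 .
\end{cases}
$$

In (68.111, 82.089), $V^{10,U}_{N-2}$ is the only valid eligible candidate
so it is optimal. Similarly, $V^{11,U}_{N-2}$ is optimal for $x_{N-2} > 82.089$.
Thus

$$
V_{N-2}(x_{N-2}, r_{N-2}=1) =
\begin{cases}
V^{1,U}_{N-2} & x_{N-2} < -10.817 \\
V^{2,U}_{N-2} & -10.817 < x_{N-2} < -6.278 \\
V^{3,U}_{N-2} & -6.278 < x_{N-2} < \delta^1_{N-2}(3) \\
V^{4,U}_{N-2} & \delta^1_{N-2}(3) < x_{N-2} < -1.587 \\
V^{5,U}_{N-2} & -1.587 < x_{N-2} < 4.0628 \\
V^{6,U}_{N-2} & 4.0628 < x_{N-2} < 10.639 \\
V^{6,R}_{N-2} = V^{7,L}_{N-2} & 10.639 < x_{N-2} < 10.662 \\
V^{7,U}_{N-2} & 10.662 < x_{N-2} < 21.582 \\
V^{8,U}_{N-2} & 21.582 < x_{N-2} < 35.987 \\
V^{9,U}_{N-2} & 35.987 < x_{N-2} < \delta^1_{N-2}(10) \\
V^{10,U}_{N-2} & \delta^1_{N-2}(10) < x_{N-2} < 82.089 \\
V^{11,U}_{N-2} & x_{N-2} > 82.089
\end{cases}
\qquad (9.155)
$$

where we have yet to determine the two joining points which lie in
the intervals

$-2.244 < \delta^1_{N-2}(3) < -2.221$

$68.042 < \delta^1_{N-2}(10) < 68.111$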

Some of the unconstrained costs in (9.155) and their corresponding
control laws can be easily obtained analytically. In particular
the optimal controller endpieces are

$V^{1,U}_{N-2}(x_{N-2},1) = (.56088)\, x_{N-2}^2 + 3.45307$

$u^{1,U}_{N-2}(x_{N-2},1) = -(.56088)\, x_{N-2}$

$V^{11,U}_{N-2}(x_{N-2},1) = (.72145)\, x_{N-2}^2 + 4.5366$

$u^{11,U}_{N-2}(x_{N-2},1) = -(.72144)\, x_{N-2}$

However, the other unconstrained cost pieces $V^{t,U}_{N-2}$ (t = 2,...,10) are
harder to obtain. In particular, in order to analytically solve for
some of the $u^{t,U}_{N-2}$ and $V^{t,U}_{N-2}$ (as functions of $x_{N-2}$), we must solve a sixth degree
polynomial in u, which must in general be done numerically.

However, we can numerically obtain the value of the optimal
control and expected cost-to-go for any $x_{N-2}$. Therefore we can
determine the optimal controller for as fine a mesh of $x_{N-2}$ values
as desired.

The procedure for doing this is as follows:

1. For intervals of $x_k$ values where a constrained cost
$V^{t,L}_k(x_k,j)$ or $V^{t,R}_k(x_k,j)$ is optimal we obtain
$u_k(x_k,j)$ and $V_k(x_k,j)$ directly from (1) - (2) of
Proposition 9.2.

2. For intervals where an unconstrained cost $V^{t,U}_k(x_k,j)$ is
optimal we obtain $u_k(x_k,j)$ and $V_k(x_k,j) = V^{t,U}_k$ as
follows:

(i) For arbitrarily chosen $z_{k+1}$ values in the constraint region
$(\gamma_{k+1}(t-1),\ \gamma_{k+1}(t))$ find $u_k$ from (9.126):

$$
u_k = -\frac{b(j)}{2R(j)}\, \left. \frac{\partial \hat V_{k+1}(z;t)}{\partial z} \right|_{z = z_{k+1}}
$$

Since we have $\hat V_{k+1}(z_{k+1};t)$, we can differentiate
(numerically or analytically) to obtain the above
quantity.

(ii) We then find the $x_k$ value that corresponds to
obtaining $z_{k+1}(x_k,j) = z$ with this $u_k$:

$x_k = z - u_k$

(iii) We can then obtain the corresponding value
of $V^{t,U}_k(x_k,j)$ from (9.128):

$V^{t,U}_k(x_k,j) = R(j)\, u_k^2 + \hat V_{k+1}(z;t)$

for this z and $u_k$ value.

We repeat this procedure for as many values as needed.
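Steps (i) - (iii) amount to sweeping the target $z$ through a constraint region and reading off the matching state, control and cost. A minimal sketch follows, demonstrated on the cubic piece of the stage-N cost (9.146) because that piece has a known closed form. The function names and the general parameters `a`, `b`, `r` (all defaulting to 1, as this example appears to use) are assumptions for illustration; step (ii) is written in the general form $x = (z - b u)/a$, which reduces to $x = z - u$ for those defaults.

```python
def v_hat_piece2(z):
    # Cubic piece of the z-conditional cost, valid for -1 < z < 3.
    return -z**3 / 8.0 + 2.5 * z**2 - 1.5 * z + 83.0 / 24.0

def dv_hat_piece2(z):
    # Analytic derivative of the piece above.
    return -0.375 * z**2 + 5.0 * z - 1.5

def sweep(v, dv, z_lo, z_hi, n, a=1.0, b=1.0, r=1.0):
    """Tabulate (x, u, z, cost) for n + 1 evenly spaced targets in [z_lo, z_hi]."""
    rows = []
    for i in range(n + 1):
        z = z_lo + (z_hi - z_lo) * i / n
        u = -b * dv(z) / (2.0 * r)   # step (i): stationary control for this target
        x = (z - b * u) / a          # step (ii): state from which this target is optimal
        cost = r * u * u + v(z)      # step (iii): unconstrained cost at that state
        rows.append((x, u, z, cost))
    return rows

rows = sweep(v_hat_piece2, dv_hat_piece2, -1.0, 3.0, 8)
for x, u, z, cost in rows:
    print(f"x = {x:8.4f}   u = {u:8.4f}   z = {z:6.3f}   V = {cost:9.4f}")
```

Because the table is indexed by the target rather than by the state, no equation has to be solved for $u$; the price is that the resulting $x$ values are not evenly spaced, so interpolation is needed to evaluate the controller at a prescribed state.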

We can use this procedure with each candidate cost in block 102
(of figure 9.12) to determine the intersections of candidate costs.
Applying this procedure to example 9.2 we obtain the optimal control,
expected cost-to-go and resulting $z_{N-1}$ value for a number of
$x_{N-2}$ values, as shown in tables 9.6 - 9.7. The joining points
resulting from crossing candidate costs are found to be approximately

$\delta^1_{N-2}(3) \approx -2.233$

$\delta^1_{N-2}(10) \approx 68.10$.

From tables 9.6, 9.7 we see that the optimal control law
$u_{N-2}(x_{N-2}, r_{N-2}=1)$ is discontinuous at

$x_{N-2} = \delta^1_{N-2}(3) \approx -2.233$

$x_{N-2} = \delta^1_{N-2}(10) \approx 68.10$.

Associated with each control law discontinuity is a region of
$z_{N-1}$ avoidance:

(-1.005, -.996) and (18.863, 18.882).

We also note that $V_{N-2}(x_{N-2}, r_{N-2}=1)$ has its minimum value near
$x_{N-2} = -.2204$. Evaluating $V_{N-2}(x_{N-2}, r_{N-2}=1)$ for $x_{N-2}$ near -.2204,
we find the minimizing $x_{N-2}$, with

$u_{N-2} = -.003$

and $V_{N-2} = 4.28517$.

x_{N-2}                        u_{N-2}      z_{N-1}      V_{N-2}(x_{N-2}, r_{N-2}=1)

-20                            11.2167      -8.7833      227.806
-12                            6.7300       -5.2694      84.213
-10.817 = delta^1_{N-2}(1)     6.0665       -4.75        69.075
-10                            5.7938       -3.933       60.032
-8                             4.4650       -3.535       36.004
-7                             3.8861       -3.1139      31.040
-6.278 = delta^1_{N-2}(2)      3.4651       -2.813       25.732
-5.581                         3.0809       -2.5         21.168
-3.344                         1.8443       -1.5         10.151
-2.243                         1.2337       -1.010       6.763
-2.233^- ~ delta^1_{N-2}(3)    1.2275       -1.005       6.73
-2.233^+ ~ delta^1_{N-2}(3)    1.2379       -.996        6.73
-2.114                         1.1644       -.95         6.482
-1.984                         1.0838       -.90         6.158
-1.587 = delta^1_{N-2}(4)      .8371        -.75         5.395
-.9117                         .4117        -.5          4.551
-.2204                         -.0296       -.25         4.287
-.0802                         -.1198       -.2          4.308
.2021                          -.3021       -.1          4.427
.4869                          -.4861       0            4.651
1.2102                         -.9602       .25          5.697
1.9493                         -1.449       .5           7.478

Table 9.6: Optimal Controller at Time k=N-2 for
Example 9.2 (Part I)
x_{N-2}                        u_{N-2}      z_{N-1}      V_{N-2}(x_{N-2}, r_{N-2}=1)

3.4756                         -2.476       1            13.463
4.0628 = delta^1_{N-2}(5)      -2.876       1.187        16.605
4.2743                         -3.024       1.25         17.849
6.8866                         -4.887       2            38.491
10.639 = delta^1_{N-2}(6)      -7.639       3            85.44
10.662 = delta^1_{N-2}(7)      -7.662       3            85.79
14.229                         -10.229      4            149.62
17.792                         -12.792      5            231.64
21.351                         -15.351      6            331.79
21.582 = delta^1_{N-2}(8)      -15.517      6.065        338.93
24.916                         -17.916      7            450.29
28.504                         -20.50       8            588.12
35.987 = delta^1_{N-2}(9)      -25.92       10.065       935.52
40                             -28.84       11.164       1155.4
50                             -36.10       13.904       1804.7
68.042                         -49.19       18.847       3343.5
68.10^-                        -49.24       18.863       3349.2
68.10^+                        -49.22       18.882       3349.2
68.111                         -49.23       18.885       3350.3
70                             -50.58       19.422       3538.8
75                             -54.15       20.849       4062.5
82.089                         -59.20       22.886       4866.1
90                             -64.93       25.07        5848.3
100                            -72.14       27.86        7219.0

Table 9.7: Optimal Controller at Time k=N-2 for Example 9.2
(Part II)
It is important to note that we can obtain the optimal
controller for any JLPC problem of this chapter in the manner
demonstrated for example 9.2. However, a large number of numerical
calculations may be necessary in order to accurately approximate the
optimal controller, and the resulting controller may not be easy
to implement. In the next section we will consider a suboptimal
approximation of the JLPC controller.


The algorithm for determining the optimal controller of section 9.7
and illustrated in section 9.8 allows us to obtain (or to numerically
approximate arbitrarily well) the optimal controller for any noisy
JLPC problem. However, this controller may have certain undesirable
properties. Specifically,

• it may be too difficult (i.e., require too many calculations) to
perform all of the analytical and/or numerical tasks required to derive
the optimal controller

• the resulting controller may be too complicated for cost-effective
implementation.

Consider the example described in the previous section at time stage
k=N-2. If we need to obtain the optimal control law for a fine mesh of
$x_{N-2}$ values then the number of calculations may be prohibitive.
Implementation of the optimal controller for $x_{N-2}$ intervals where the optimal
control law is not analytically available will require a "table look-up"
operation (and interpolation for $x_{N-2}$ values between table entries) in
order to determine the control input to be applied for any encountered
$x_{N-2}$. These implementation tasks may be too expensive or too time
consuming to be economically feasible.

These difficulties motivate the development of a suboptimal
approximation of the optimal controller that is easier to determine and implement.
In this section we will consider a suboptimal approximation of the
optimal controller that drives the system to one of a set of
arbitrarily chosen $z_{k+1}$ values. The basic idea is as follows:

• at each time stage k we designate a set of $z_{k+1}$ values that the
controller may hedge to from $(x_k, r_k=j)$. The cost of hedging from any
$x_k$ to a specified $z_{k+1}$ is quadratic in $x_k$. These quadratic costs
are compared for each $x_k$, and the control law corresponding to the
lowest one is chosen.

This approximation method yields a controller that has control laws that
are piecewise-linear in $x_k$, and piecewise-quadratic expected costs-to-go.
This is essentially a brute force approximation of the optimal controller.
In principle, if we choose enough target $z_{k+1}$ values at each time k, we
can obtain arbitrarily good approximations. An open question is how to
intelligently choose these target values. One reasonable set of target
choices are the discontinuous points of the z-conditional cost
$\hat V_{k+1}(z_{k+1} \mid r_k=j)$, since we know that the optimal controller may hedge
to such points. In the example below we have chosen these points and a
grid of values in between. The performance of the suboptimal controller
is unsatisfactory after two time stages. At least for special classes
of JLPC problems, it should be possible to use knowledge of the
structure of the problem to obtain a better approximation of the optimal
controller. We have not addressed the topic of approximation of the
optimal JLPC controller in detail here.

This approximation of the optimal JLPC controller consists of the
tasks specified in the flowcharts of figures 9.5-9.7 and 9.17, which can
be summarized as follows:

1. The overall algorithm framework is described by figure 9.5, as in
the optimal algorithm.

2. The $x_{k+1}$-conditional cost and $x_{k+1}$ grid are obtained in figure
9.6 and block 27 of figure 9.7 as in the optimal algorithm, except
that the approximate $\tilde V_{k+1}(x_{k+1}, r_{k+1}=i)$ (obtained in the
preceding iteration) is used instead of the true optimal cost.

3. The $z_{k+1}$-conditional cost $\hat V_{k+1}(z_{k+1} \mid r_k=j)$ is computed via figure
9.7, exactly as in the optimal algorithm (except that the
$\hat V_{k+1}(x_{k+1} \mid r_k=j)$ from step 2 is used instead of the true one).

4. We then follow the steps of figure 9.17 (replacing figures 9.8-
9.13 of the optimal controller derivation algorithm).

In table 9.8 we list the block numbers of the flowchart for this
approximation scheme. The circled numbers in the table refer to
points where the flowchart control path enters and leaves the
different figures.


Figure Number    Block Numbers    Entry Points        Exit Points

9.5              1-13             start (block 1)     stop (block 12)
9.6              14-26            from block 10       ○ (block 22)
9.7              27-42            ○ (block 27)        ○ (block 33)
9.17             43-59            ○ (block 43)        block 59

Table 9.8: Block number locations, entry points and exit points for
suboptimal approximate controller derivation.
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Let  us  now  consider  the  suboptimal  approximation  method  in  detail. 


At  each  time  stage  k  and  in  each  form  j£  M_  we  begin  by  calculating  the 
z^+1  cost  r^*j)  via  the  steps  specified  in  figure  9.7. 

T^1S  Vk+l^Zk+l  I  r*=j)  '3aset^  uPon  the  approximate  expected  cost-to-go 

A/ 

Vk+1  ^Xk+l'rk=j)  ^comPuted  in  the  preceeding  iteration). 


In  the  process  of  determining  V^^fz^^jr^j)  in  figure  9.7  we  also 

obtain  the  partition  grid-points  (y(l),...,  y(L)  }•  Note  that  for  this 

* 

suboptimal  controller  we  do  not  need  to  differentiate  vk+^ ( z^+1J  rk=j)  '  nor 
do  we  have  to  add  the  extra  z^+1  partition  grid  points  that  are  used  in 
the  optimal  controller  derivation  (in  figure  9.8). 

As we indicated above, the basic approximation idea is to calculate
and compare the costs associated with hedging to each member of a set of
specified $z_{k+1}$ values. Included in this list are the $z_{k+1}$ grid
points (obtained in figure 9.7) where $\tilde V_{k+1}(z_{k+1} \mid r_k = j)$
is discontinuous. Let $N(z_{k+1})$ denote the number of these target
$z_{k+1}$ values; that is, we designate a list of values

$$\{ z_{k+1} = z(i) : i = 1, \ldots, N(z_{k+1}) \}$$

that the system is required (by our suboptimal approximation) to
hedge to. This list of target points includes the discontinuous
points of the z-conditional cost (where we know that the true optimal
controller may hedge to) and we have chosen additional target
points in between (hopefully enough for a good approximation). For
each of these target points $z(i)$, the control law

$$u(i) = -L(i)\, x_k + F(i) \qquad (9.156)$$

drives the system to $z_{k+1} = z(i)$ from any $(x_k, r_k = j)$ with the
resulting quadratic cost

$$V(i) = K(i)\, x_k^2 + H(i)\, x_k + G(i) \qquad (9.157)$$

where the coefficients in (9.157) are

$$L(i) = a(j)/b(j) \qquad (9.158)$$

$$F(i) = z(i)/b(j) \qquad (9.159)$$

$$K(i) = a^2(j)\, R(j)/b^2(j) \qquad (9.160)$$

$$H(i) = -2\, a(j)\, R(j)\, z(i)/b^2(j) \qquad (9.161)$$

$$G(i) = z^2(i)\, R(j)/b^2(j) + \tilde V_{k+1}(z(i) \mid r_k = j) \qquad (9.162)$$


Note that the index $i$ in (9.156)-(9.162) refers to the target point
$z(i)$. The parameters $F(i)$, $H(i)$ and $G(i)$ here depend upon $z(i)$.

Note that $G(i)$ (and hence $V(i)$) is not quadratic in $z$; however
$V(i)$ is quadratic in $x_k$. This is the motivation for this
approximation scheme; we are obtaining a quadratic controller structure
by evaluating $\tilde V_{k+1}(z(i) \mid r_k = j)$ only at specific points.
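The coefficient formulas (9.158)-(9.162) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes scalar noiseless-mean dynamics of the form z_{k+1} = a x_k + b u_k with control cost R u^2 (which is what the formulas imply), and the names `hedging_law`, `control`, `cost` and the callable `cost_to_go` are hypothetical, not taken from the thesis.

```python
def hedging_law(a, b, R, z_target, cost_to_go):
    """Coefficients of (9.156)-(9.162) for one target point z(i).

    Assumes z_{k+1} = a*x_k + b*u_k and control cost R*u^2 (an assumption
    read off the formulas); cost_to_go(z) plays the role of the
    z-conditional cost evaluated at the target.
    """
    L = a / b                                      # (9.158)
    F = z_target / b                               # (9.159)
    K = a * a * R / (b * b)                        # (9.160)
    H = -2.0 * a * R * z_target / (b * b)          # (9.161)
    G = z_target ** 2 * R / (b * b) + cost_to_go(z_target)  # (9.162)
    return {"L": L, "F": F, "K": K, "H": H, "G": G}


def control(law, x):
    """Control law (9.156): u = -L x + F."""
    return -law["L"] * x + law["F"]


def cost(law, x):
    """Quadratic cost (9.157): V = K x^2 + H x + G."""
    return law["K"] * x * x + law["H"] * x + law["G"]
```

With a = b = R = 1 and a zero cost-to-go, the law for target z = -15 has H = 30 and F = -15, and the cost (9.157) reduces to u^2, as expected when only the control effort is charged.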

For each $x_k$ value we choose $\tilde u_k(x_k, r_k = j)$ to be the control
law in (9.156) that yields the least expected cost in (9.157). To do this we
must find the intersections of the costs
$\{V(i) : i = 1, \ldots, N(z_{k+1})\}$ in (9.157). That is, we obtain the
approximate controller expected cost $\tilde V_k(x_k, r_k = j)$ via the
minimization

$$\tilde V_k(x_k, r_k = j) = \min_{i = 1, \ldots, N(z_{k+1})} V(i) \qquad (9.163)$$

at each $x_k$ value. Consequently the approximate controller cost
$\tilde V_k(x_k, r_k = j)$ is piecewise-quadratic in $x_k$. Consider now the
minimization in (9.163). Since $K(i)$ is the same for every target point,
(9.157) shows that $V(i)$ intersects $V(j)$ (here $j$ labels a second target
point, not the form) at

$$x_k = \frac{G(j) - G(i)}{H(i) - H(j)} \qquad (9.164)$$


Therefore we can solve (9.163) for all $x_k$ by determining the intersections
of the $V(i)$ using (9.164). Following the basic idea of the controller
algorithms of chapters 7-9, we need not solve for all of these
intersections. For $x_k$ sufficiently negative, the lowest cost in
$\{V(i) : i = 1, \ldots, N(z_{k+1})\}$ will be $V(1)$ (that is, the leftmost
target value is hedged to) when $a(j) > 0$ (or the rightmost when
$a(j) < 0$).

We then sweep rightward along the axis of $a(j)\,x_k$ values. At some
$x_k$ value one of the other $V(j)$ costs in (9.157) will intersect $V(1)$ and
will then become the prevailing optimal cost until it in turn is
intersected by another cost. Since the mapping from $x_k$ to the target
value $z_{k+1}$ that is hedged to is monotone nondecreasing for $a(j) > 0$,
we need only consider the intersections of the prevailing cost $V(n)$ with
$V(n+1), \ldots, V(N(z_{k+1}))$.

This suboptimal controller will consist of $\tilde m_k(j)$ pieces (at time
$k$ from form $j$), where $\tilde m_k(j)$ increases at most linearly with the
number of target $z_{k+1}$ values, $N(z_{k+1})$.
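The rightward sweep described above can be sketched as follows. This is a minimal illustration, not the flowchart of figure 9.17: it assumes that all candidate costs share the same quadratic coefficient K, so that comparing them reduces to comparing the lines H(i) x + G(i), and it searches all candidates with smaller slope rather than exploiting the ordering of the targets. The function name `lower_envelope` is hypothetical.

```python
def lower_envelope(H, G):
    """Sweep left to right over V(i) = K x^2 + H[i] x + G[i] (common K).

    Returns a list of (index, upper_breakpoint) pairs: candidate `index`
    is the least-cost one up to `upper_breakpoint`, where the next pair
    takes over. Breakpoints come from the intersection formula (9.164).
    """
    n = len(H)
    # For x very negative the candidate with the largest slope is lowest
    # (ties broken by the smaller constant term).
    cur = max(range(n), key=lambda i: (H[i], -G[i]))
    pieces = []
    while True:
        best_x, best_i = None, None
        for i in range(n):
            if H[i] < H[cur]:  # only these can undercut `cur` to the right
                x = (G[i] - G[cur]) / (H[cur] - H[i])  # (9.164)
                if (best_x is None or x < best_x
                        or (x == best_x and H[i] < H[best_i])):
                    best_x, best_i = x, i
        if best_i is None:
            pieces.append((cur, float("inf")))
            return pieces
        pieces.append((cur, best_x))
        cur = best_i
```

Because the slope of the prevailing candidate strictly decreases at each step, the loop ends after at most N(z_{k+1}) pieces, which matches the linear bound on the number of pieces stated above.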

The computational and implementation advantages of this suboptimal
controller, relative to the optimal JLPC controller derivation algorithm,
are as follows. For the approximate controller:

1. We need not compute the functions [symbols illegible in the source].

2. We need not determine (analytically or numerically) any of the
[symbol illegible in the source] costs, nor do we have to determine a
0-0 grid as in figure 9.10.

3. The candidate cost functions are all quadratic in $x_k$, so it is
easy to determine their intersections.

4. The suboptimal control law is piecewise-linear, unlike the optimal
controller; thus the suboptimal controller will often be
easier to implement.

5. The suboptimal controller has the same type of structure (i.e.,
piecewise-linear control laws) at each time stage, unlike the
(in general) changing structure of the optimal controller. This
also facilitates implementation of the suboptimal controller.

6. The suboptimal controller specifies a control input for every $x_k$.
The numerical implementation of the optimal JLPC controller (as
in the previous section) requires interpolation between stored
values.

To illustrate this suboptimal controller we apply it to two time stages
(example 9.2) and compare the resulting controller with
the optimal controllers obtained in section 9.8.

Example 9.5: (Applying Suboptimal Approximation to Example 9.2)

In the previous section we found that at time k=N-1 the optimal JLPC
controller for example 9.2 has five pieces. One of these pieces involves
a control law that is nonlinear in $x_{N-1}$. Let us now consider the
suboptimal approximation of this controller that is specified by figures
9.5-9.7, 9.17. In section 9.3 we obtained (9.146):


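$$\hat V_N(z_N \mid r_{N-1} = 1) =
\begin{cases}
\ldots & \text{if } -1 < z_N < 3 \\
\ldots & \text{if } 3 \le z_N
\end{cases}$$

(the polynomial coefficients of each piece, and any piece for
$z_N \le -1$, are illegible in the source)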

This $z_N$-conditional cost was obtained via the steps of figures 9.5 - 9.7.

In order to apply the approximation technique of figure 9.17 we must
choose a set of $N(z_N)$ target values of $z_N$, to which the suboptimal
controller is constrained to hedge. In table 9.9 we list a set of
$N(z_N) = 34$ such values. Note that we have included the values of
$z_N$ where $\hat V_N(z_N \mid r_{N-1} = 1)$ is discontinuous (i.e., $-1^-$,
$-1^+$, $3^-$ and $3^+$). The remaining points have been chosen arbitrarily.

Following the instructions of figure 9.17 we obtain the following
approximation of the optimal JLPC controller for example 9.2 at time
k=N-1:

$$\tilde V_{N-1}(x_{N-1}, r_{N-1} = 1) = K_{N-1}(t:1)\, x_{N-1}^2 + H_{N-1}(t:1)\, x_{N-1} + G_{N-1}(t:1) \qquad (9.165)$$

$$\tilde u_{N-1}(x_{N-1}, r_{N-1} = 1) = -L_{N-1}(t:1)\, x_{N-1} + F_{N-1}(t:1) \qquad (9.166)$$

$$z_N(x_{N-1}, r_{N-1} = 1) = F_{N-1}(t:1) \qquad (9.167)$$

for $\delta_{N-1}(t-1) < x_{N-1} < \delta_{N-1}(t)$,
$t = 1, \ldots, \tilde m_{N-1}(1)$,

where the parameters in (9.165)-(9.167) are as defined in (9.157)-(9.162)
and the $\{\delta_{N-1}(t)\}$ are determined in blocks 47 or 54 of figure 9.17.

In table 9.10 we list the parameters for this example when the $z_N$
grid of table 9.9 is used. Here $\tilde m_{N-1}(1) = 25$. That is, the
approximate controller has 25 pieces at time k = N-1.


  i    z(i)             i    z(i)

  1    [illegible]     18    [illegible]
  2    [illegible]     19    [illegible]
  3    [illegible]     20    [illegible]
  4    -2.5            21    1.5
  5    -2.25           22    1.75
  6    -2              23    2
  7    -1.75           24    2.25
  8    -1.5            25    2.5
  9    -1.25           26    2.75
 10    -1^-            27    3^-
 11    -1^+            28    3^+
 12    -.75            29    3.25
 13    -.5             30    3.5
 14    -.25            31    3.75
 15    0               32    4
 16    .25             33    5
 17    .5              34    10

Table 9.9: The $N(z_N) = 34$ target $z_N$ values at time k = N-1 in
example 9.5.


piece t   H(t:1)     G(t:1)     F(t:1)    delta(t)   target choice

   1       30        782.43     -15       -43.23        1
   2       20        350.17     -10       -30.34        2
   3       15        198.98     -7.5      -21.62        3
   4       10        90.380     -5        -16.86        4
   5       9.5       81.949     -4.75     -13.39        5
   6       6         35.075     -3        -8.626        6
   7       4         17.824     -2        -5.140        7
   8       2         7.5437     -1        -2.627        8
   9       1.5       6.2298     -.75      -1.949        9
  10       1.28      5.8009     -.64      -.5153       10
  11       0         5.1414     0         1.069        11
  12       -.34      5.5033     .17       2.968        12
  13       -1.9      10.133     .95       5.052        13
  14       -2.374    12.528     1.187     6.691        14
  15       -3.42     19.526     1.71      9.408        15
  16       -4.88     33.262     2.44      11.24        16
  17       -5.07     35.398     2.535     12.51        17
  18       -6        47.032     3         13.61        18
  19       -6.3      51.116     3.15      15.42        19
  20       -7.68     72.396     3.84      17.75        20
  21       -10.28    118.55     5.14      27.02        22
  22       -12.13    168.54     6.065     28.12        23
  23       -14       221.12     7         32.02        24
  24       -16       285.16     8         36.03        25

Table 9.10: Suboptimal Controller of Example 9.5 at k = N-1


Comparing the hedging behavior of the optimal JLPC controller (see
figure 9.16) and the suboptimal controller of table 9.10, we see that

. The optimal controller hedges to $z_N = -1^-$ for
$-2.75 \le x_{N-1} < -.8125$; the suboptimal controller hedges to
$z_N = -1^-$ for $-2.226 < x_{N-1} < -.3125$.

. The optimal controller hedges to $z_N = 3^-$ for
$8.0625 < x_{N-1} \le 20.866$; the suboptimal controller hedges to
$z_N = 3^-$ for $7.762 < x_{N-1} \le 20.875$.

At time stage k=N-1 the optimal controller (see (9.148)) has
$m_{N-1}(1) = 6$ pieces, one of which is not linear in $x_{N-1}$. The
approximate controller has $\tilde m_{N-1}(1) = 25$ pieces, but all are
linear in $x_{N-1}$. Of course, we can reduce the number of pieces in the
suboptimal controller by reducing the number of grid points.

In table 9.11 we list the optimal and approximate controls, expected
costs and resulting $z_N$ values for various $x_{N-1}$ values. Note that the
percentage of excess cost incurred using the suboptimal controller is
small for all of these values. As $|x_{N-1}|$ becomes large, the error will
become large since the suboptimal controller drives $z_N$ to either -6 or
+10 (the extreme values of the target grid). However for $x_{N-1}$ within
the interval of interest, the suboptimal controller is quite accurate at
this first time stage.
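The percentage of excess cost quoted in table 9.11 is the relative increase of the suboptimal expected cost over the optimal one. A short sketch of that arithmetic, using two rows of the table as printed (the function name `excess_cost_percent` is hypothetical):

```python
def excess_cost_percent(v_opt, v_sub):
    """Percent increase in expected cost of the suboptimal controller."""
    return 100.0 * (v_sub - v_opt) / v_opt


# Rows of table 9.11 at x_{N-1} = -15 and x_{N-1} = 0 (values as printed).
row_a = excess_cost_percent(145.52, 146.33)   # listed as .557
row_b = excess_cost_percent(3.2966, 3.300)    # listed as .103
```

Both results agree with the listed percentages to the three decimals shown in the table.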

Let us now apply the suboptimal controller to time stage k = N-2.

Using the approximate expected cost-to-go we obtain an approximate
$\tilde V_{N-1}(z_{N-1} \mid r_{N-2} = 1)$ which has the structure

$$\tilde V_{N-1}(z_{N-1} \mid r_{N-2} = 1) = c_3(t)\, z_{N-1}^3 + c_2(t)\, z_{N-1}^2 + c_1(t)\, z_{N-1} + c_0(t) \qquad (9.168)$$

on each piece $t$ (the coefficient symbols are illegible in the source;
$c_0, \ldots, c_3$ are placeholder names).

             OPTIMAL CONTROLLER               SUBOPTIMAL CONTROLLER
x_{N-1}      u_{N-1}    z_N       V_{N-1}     u_{N-1}    z_N     V_{N-1}   % increase

-15          9.545      -5.4545   145.52      9          -6      146.33    .557
-10          6.364      -3.6364   65.970      7          -3      67.08     2.774
-5           3.182      -1.8182   18.242      3.25       -1.75   18.33     1.68
-3           1.909      -1.0909   8.0606      2          -1^-    8.08      .241
-2.75^-      1.75       -1^-      7.1458      1.75       -1^-    7.1458    0
-2.75^+      1.75       -1^-      7.1458      1.75       -1^-    7.1458    0
-2           1          -1^-      5.0833      1          -1^-    5.0833    0
-1.5         .5         -1^-      4.3333      .5         -1^-    4.3333    0
-1           0          -1^-      4.0833      0          -1^-    4.0833    0
-.8125^-     -.187      -1^-      4.1183      -.187      -1^-    4.1183    0
-.8125^+     .795       -.0178    4.1176      .8125      0       4.1182    .015
-.5          .572       .0717     3.6906      .5         0       3.708     .471
0            .2168      .2168     3.2966      .25        .25     3.300     .103
.5           -.1357     .3643     3.2562      -.25       .25     3.300     1.345
1            -.4858     .5142     3.5672      -.5        .5      3.568     .022
1.5          -.8333     .6667     4.2270      -.75       .75     4.249     .520
2            -1.178     .8219     5.2330      -1.25      .75     5.249     .306
5            -3.180     1.820     18.368      -3.25      1.75    18.38     .065
8.0625^-     -5.0625    3^-       43.712      -5.0625    3^-     43.712    0
8.0625^+     -5.0625    3^-       43.712      -5.0625    3^-     43.712    0
10           -7         3^-       67.083      -7         3^-     67.083    0
15           -12        3^-       162.08      -12        3^-     162.08    0
20.866^-     -17.866    3^-       337.28      -17.866    3^-     337.28    0
20.866^+     -15.956    4.9096    337.28      -15.866    5       337.33    .015
25           -19.118    5.8823    482.28      -20        5       485.6     .688

Table 9.11: Performance of the optimal and suboptimal controller
at various $x_{N-1}$ values


This approximate cost is computed via the steps of figure 9.7. It has
$\psi = 51$ pieces which are listed in table 9.12. Note that each piece of
this approximate $z_{N-1}$ cost is either quadratic or cubic in $z_{N-1}$.
This is similar to $\hat V_N(z_N \mid r_{N-1} = 1)$, which has quadratic and
cubic pieces in $z_N$. In section 9.8 we found that the true $z_{N-1}$ cost
$\hat V_{N-1}(z_{N-1} \mid r_{N-2} = 1)$ has many fewer pieces than this
approximate version has, and several of the true cost pieces have
complicated terms such as

$$[(-.00625)\, z_{N-1} + .08489] \,/\, (72.44 - 5.33\, z_{N-1})^3$$

in them. Thus we see that for this example the approximate $z_{N-1}$ cost
has more pieces, but a simpler structure. At successive time stages
the true $z_k$-cost $\hat V_k(z_k \mid r_{k-1} = j)$ will have even more
complicated pieces; the approximate cost
$\tilde V_k(z_k \mid r_{k-1} = j)$ will always have pieces that are
at most cubic in $z_k$. Note that in this example
$\tilde V_{N-1}(z_{N-1} \mid r_{N-2} = 1)$ is continuous in $z_{N-1}$.
Therefore we have no specific values to include in the target set of
$z_{N-1}$ values.

The next step in the determination of the suboptimal controller
is the selection of a set of $N(z_{N-1})$ values. In table 9.13 a set of
$N(z_{N-1}) = 34$ such values have been arbitrarily chosen. Following the
steps of figure 9.17 we obtain the following approximation of the
optimal controller for example 9.2 at time k = N-2.


$$\tilde V_{N-2}(x_{N-2}, r_{N-2} = 1) = K_{N-2}(t:1)\, x_{N-2}^2 + H_{N-2}(t:1)\, x_{N-2} + G_{N-2}(t:1) \qquad (9.169)$$

$$\tilde u_{N-2}(x_{N-2}, r_{N-2} = 1) = -L_{N-2}(t:1)\, x_{N-2} + F_{N-2}(t:1) \qquad (9.170)$$

$$z_{N-1}(x_{N-2}, r_{N-2} = 1) = F_{N-2}(t:1) \qquad (9.171)$$

for $\delta_{N-2}(t-1) < x_{N-2} < \delta_{N-2}(t)$,
$t = 1, \ldots, \tilde m_{N-2}(1)$,

where the parameters in (9.169)-(9.171) are as defined in (9.157)-(9.162)
and the $\delta_{N-2}(t)$ are determined in blocks 47, 54 of figure 9.17.

In table 9.14 we list the parameters for this example when the $z_{N-1}$
grid of table 9.13 is used. Here $\tilde m_{N-2}(1) = 33$. That is, the
suboptimal approximate controller has 33 pieces at time k = N-2. The
suboptimal controller described in table 9.14 has almost three times as
many pieces as the optimal controller. However all of the suboptimal
controller pieces are linear control laws in $x_{N-2}$. Many of the optimal
control law pieces are difficult or impossible to obtain analytically.

In section 9.8 we obtained numerically the optimal control law,
expected cost and attained $z_{N-1}$ value for each item in a set of
$x_{N-2}$ values (see table 9.6). In table 9.15 we compare the suboptimal
controller's behavior at these $x_{N-2}$ values. Note that with the 34
target values that we have arbitrarily chosen (in table 9.13), the
resulting suboptimal controller has large errors for all $x_{N-2}$ values
(20%-30%). Thus the error at the second time stage is an order of magnitude
greater than at k = N-1. In order to reduce this error we can
increase the density of the grid, but this increases the complexity
of the approximate controller.

The approximate controller described in this section is a
brute force approach to obtaining an approximation of the optimal
JLPC controller that has an easily implementable structure. As
we have seen, this approximation is prone to large errors after
only two time steps. We have not considered more successful
approximation methods in detail here.

piece    (parameter column headings illegible in the source)

  1      -        2.55      3          28.7325
  2      -        2.3625    -2.3906    -10.0126
  3      -        2.55      1.5        10.17
  4      -        2.5344    1.1906     8.6386
  5      -        2.5188    .9025      7.3103
  6      -        2.5031    .6363      6.2074
  7      -        2.4875    .3906      5.2421
  8      -        2.4719    .1694      4.4589
  9      -        2.4563    -.0331     3.8032
 10      -        2.4719    .1513      4.3471
 11      -        2.4563    -.0294     3.8248
 12      -        2.4719    .1338      4.2505
 13      -        2.4563    -.0254     3.8451
 14      -        2.4719    .1158      4.1643
 15      -        2.4875    .2364      4.3971
 16      -        2.5031    .3346      4.5512
 17      -        2.4875    .1705      4.2133
 18      -        2.5031    .2479      4.3093
 19      -        2.4875    .1756      4.2255
 20      -        2.5031    .2313      4.2751
 21      -        2.4875    .1855      4.2416
 22      -        2.5031    .2197      4.2603
 23      .0917    2.9906    1.5157     5.1605
 24      .0917    2.9438    1.4559     5.1414
 25      .0917    2.8969    1.4711     5.1401
 26      .0917    2.85      1.5607     5.0973

Table 9.12, Part I: Parameters for approximate $z_{N-1}$ cost
$\tilde V_{N-1}(z_{N-1} \mid r_{N-2} = 1)$ in example 9.5. A dash marks an
empty entry.


piece t   H(t:1)     G(t:1)     F(t:1)    delta(t)      target choice

   1       30        782.43     -15       -43.23           1
   2       20        350.17     -10       -30.34           2
   3       15        198.98     -7.5      -21.62           3
   4       10        90.380     -5        -16.86           4
   5       9.5       81.949     -4.75     -13.39           5
   6       6         35.075     -3        -8.626           6
   7       4         17.824     -2        -5.140           7
   8       2         7.5437     -1        -2.627           8
   9       1.5       6.2298     -.75      -1.949           9
  10       1.28      5.8009     -.64      -.5153          10
  11       0         5.1414     0         1.069           11
  12       -.34      5.5033     .17       2.968           12
  13       -1.9      10.133     .95       5.052           13
  14       -2.374    12.528     1.187     6.691           14
  15       -3.42     19.526     1.71      9.408           15
  16       -4.88     33.262     2.44      11.24           16
  17       -5.07     35.398     2.535     12.51           17
  18       -6        47.032     3         13.61           18
  19       -6.3      51.116     3.15      15.42           19
  20       -7.68     72.396     3.84      17.75           20
  21       -10.28    118.55     5.14      27.02           22
  22       -12.13    168.54     6.065     28.12           23
  23       -14       221.12     7         32.02           24
  24       -16       285.16     8         36.03           25
  25       -18       357.23     9         40.12           26
  26       -20.13    442.68     10.065    48.59           27
  27       -26       727.87     13        59.29           28
  28       -30       965.03     15        74.58           29
  29       -37.732   1541.7     18.866    93.34           30
  30       -45.732   2288.4     22.866    101.5           31
  31       -50       2721.7     25        117.5           32
  32       -60       3897.1     30        160.1           33
  33       -70       5497.8     35        [illegible]     34

Table 9.14: Suboptimal controller of example 9.5 at time k = N-2


             OPTIMAL CONTROLLER                 SUBOPTIMAL CONTROLLER
x_{N-2}      u_{N-2}    z_{N-1}   V_{N-2}       u_{N-2}    z_{N-1}   V_{N-2}   % increase in cost of
                                                                               suboptimal controller

3.4756       -2.476     1         13.463        -2.526     .95       15.61     15.94
4.0628       -2.876     1.187     16.605        -3.113     .95       18.92     13.94
4.2743       -3.024     1.25      17.849        -3.324     .95       20.28     13.61
6.8866       -4.887     2         35.491        -5.177     1.71      43.40     12.75
10.639       -7.639     3         85.44         -8.199     2.44      94.53     10.64
10.662       -7.662     3         85.79         -8.222     2.44      94.91     10.53
14.229       -10.229    4         149.62        -11.079    3.15      163.9     9.57
17.792       -12.792    5         231.64        -12.652    5.14      252.2     8.88
21.351       -15.351    6         331.79        -16.211    5.14      354.9     6.97
21.582       -15.517    6.065     338.93        -16.44     5.14      362.5     6.95
24.916       -17.916    7         450.29        -19.78     5.14      483.2     [illegible]
28.504       -20.50     8         588.12        -21.50     7         634.5     7.39
35.987       -25.92     10.065    935.52        -27.99     8         1004.4    7.37
40           -28.84     11.164    1154.4        -32        8         1245.2    7.77
50           -36.10     13.904    1804.7        -37        13        1927.9    6.82
68.042       -49.19     18.847    3343.5        -53.04     15        3553.4    6.?9
68.10^-      -49.24     18.863    3349.2        -53.1      15        3559.6    6.28
68.10^+      -49.22     18.882    3349.2        -53.1      15        3559.6    6.28
68.111       -49.23     18.885    3350.3        -53.1      15        3560.1    6.28
70           -50.58     19.422    3538.8        -55        15        3765.0    6.39
75           -54.15     20.849    4062.5        -56.13     18.87     4336.8    6.75
82.089       -59.20     22.886    4866.1        -63.22     18.87     5182.9    6.51
90           -64.93     25.07     5848.3        -71.13     18.87     6245.8    6.80
100          -72.14     27.86     7219.0        -81.13     18.87     7768.5    7.61

Table 9.15, Part I: Performance of the optimal and suboptimal controller
at various $x_{N-2}$ values.

             OPTIMAL CONTROLLER                 SUBOPTIMAL CONTROLLER
x_{N-2}      u_{N-2}    z_{N-1}   V_{N-2}       u_{N-2}    z_{N-1}   V_{N-2}   % increase in cost of
                                                                               suboptimal controller

-20          11.2167    -8.7833   227.806       12.5       -7.5      298.98    31.24
-12          6.7300     -5.2694   84.213        9          -3        107.075   27.15
-10.817      6.0665     -4.75     69.075        7.817      -3        87.130    26.21
-10          5.7938     -3.933    60.032        7          -3        75.075    25.06
-8           4.4650     -3.535    36.004        6          -2        49.824    38.38
-7           3.8861     -3.1139   31.040        5          -2        38.824    25.08
-6.278       3.4651     -2.813    25.732        4.278      -2        32.125    24.85
-5.581       3.0809     -2.5      21.168        3.581      -2        26.648    25.89
-3.344       1.8443     -1.5      10.151        2.344      -1        12.038    18.59
-2.243       1.2337     -1.010    6.763         1.493      -.75      7.896     16.76
-2.233       1.2275     -1.005    6.73          1.483      -.75      7.867     16.89
-2.233^+     1.2379     -.996     6.73          1.483      -.75      7.867     16.89
-2.114       1.1644     -.95      6.482         1.364      -.75      7.528     16.13
-1.984       1.0838     -.90      6.158         1.234      -.75      7.190     16.76
-1.687       .8371      -.75      5.395         .947       -.64      6.288     16.55
-.9117       .4117      -.5       4.551         .2717      -.64      5.465     20.09
-.2204       -.0296     -.25      4.287         .2204      0         5.110     21.06
-.0802       -.1198     -.2       4.308         .0802      0         5.148     19.49
.2021        -.3021     -.1       4.427         -.2021     0         5.182     17.06
.4889        -.4869     0         4.651         -.4869     0         5.378     15.64
1.2102       -.9602     .25       5.697         -1.04      .17       6.556     15.09
1.9493       -1.449     .5        7.478         -1.779     .17       8.640     15.54

Table 9.16, continued


9.10


In this chapter we have extended the solution methodology of chapters
5-8 to encompass jump linear control problems that involve additive
input noise in the x-dynamics. The presence of additive input noise
profoundly changes the nature of the optimal controller in that it is not
possible to use the control input to drive the x process into a specified
interval of values with certainty.

In extending the methodology of chapter 8 to include input noise we
lose the piecewise-quadratic nature of the optimal controller. We have
therefore relaxed the restrictions on the x-operating costs, x-terminal
costs and form transition probabilities (requiring only a finite number
of convex or concave pieces) because the piecewise-quadratic structure of
the optimal controller cost is lost in any event.

In sections 9.2 - 9.7 we formulated the general JLPC control problem,
obtained a general one-step solution procedure, investigated the
qualitative properties of the optimal one-step solution, and presented an
algorithm (flowchart) for the computation of the optimal controller
for finite time-horizon problems. This algorithm was applied to two
time stages of an example in section 9.8. In section 9.9 we developed a
suboptimal JLPC approximation that results in controllers which have
piecewise-linear control laws (in $x_k$), at all times
$k = N-1, \ldots, k_0$. This suboptimal controller was applied to the
example of section 9.8 and the optimal and suboptimal controllers were
compared.

In the next chapter we will consider the application of the
methodology of this thesis to jump linear control problems possessing
n-dimensional states with x-dependent and u-dependent form
transitions and having form controls. These problems can be
addressed using approximations similar to the approximation
technique of section 9.9.

10. JUMP LINEAR CONTROL PROBLEMS WITH NONSCALAR x AND CONTROL-
DEPENDENT FORM TRANSITIONS

10.1 Introduction

In part III and chapters 3-9 of this thesis we have developed
and applied a basic approach for the solution of optimal jump linear
feedback control problems. This methodology is based upon the following
tactic:

. at each time stage $k$ and from each form $r_k = j$, the control
problem is broken up into a set of constrained subproblems that are
relatively easy to solve.

. then the solutions of these constrained subproblems are compared
at each $x_k$ value to determine the optimal controller.

In this chapter we will briefly consider how this general solution
approach can be applied to several other classes of jump linear control
problems. Specifically, we will examine

. JLPC problems involving a nonscalar x process (Section 10.2)

. JLPC problems involving u-dependent form transitions (Section
10.3)

and

. JLPQ (and JLPC) problems where the form process can be directly
or indirectly controlled by a separate form control
(Section 10.4).

Combining these extensions, we obtain the general control problem
formulation that was introduced and motivated in detail in chapter 1. Our
motivation for addressing these issues here is to demonstrate that
the basic idea of this thesis can potentially be applied to
more realistic fault-tolerant optimal control problems.

As we will see, each of these problems represents a generalization of
the results of chapters 5-9 that involves an increase in complexity
in both the optimal controller derivation and the implementation
of the resulting control laws. However the basic solution approach of
Part III remains valid for these problems and it yields methods for
their solution.


10.2 Nonscalar x Processes

When the x-process in a JLQ, JLPQ or JLPC problem is n-dimensional,
the basic solution idea of dividing the problem into constrained
subproblems is still a valid one. However there are fundamental
differences in the resulting subproblems.

The basic difference involves the nature of the subproblem
constraints. Recall that in the problems of chapters 5-9 the optimal
JLPC control problem is converted into the comparison of a set of
subproblems constrained in $x_{k+1}$ (or in $z_{k+1}$ for noisy JLPC
problems), which is determined by the control $u_k$ when $x_k$ and
$r_k = j$ are known. The $\psi_{k+1}^j$ different subproblems involve
constraining $z_{k+1}$ to be in a certain interval

$$\Delta_{k+1}(t) = (\gamma_{k+1}(t-1),\ \gamma_{k+1}(t))$$

of values on the real number line. Thus the optimal solution of each
subproblem either places $z_{k+1}$ in the interior of the constraint
interval $\Delta_{k+1}(t)$ or on one of (at most two) boundary points. When
the x process is nonscalar, the constraint set $\Delta_{k+1}^j(t)$,
$t = 1, \ldots, \psi_{k+1}^j$, for each subproblem is not an interval on the
real line but, rather, a region of n-space bounded by a surface. If, for
some $(x_k, r_k = j)$, the constraint in a particular subproblem is active,
we must determine the best location for $x_{k+1}$ on the constraint surface.
Thus we have uncountably many points to consider (instead of two).

A second complicating factor in the consideration of n-dimensional
x-processes in JLQ, JLPQ or JLPC problems relates to the comparison
of subproblem solutions (so as to determine the optimal control
law for each $x_k$). In the scalar x case this involved finding the
intersections of convex functions of x. Using the monotonicity property
of the mapping

$$(x_k, r_k = j) \mapsto z_{k+1}$$

in the scalar case (as in Proposition 9.6), only one such intersection
needed to be found for some (but not all) pairs of valid, eligible
candidate costs. For nonscalar $x_k$ we must find the curves described
by the intersection of valid, eligible candidate cost surfaces. This
comparison was greatly simplified for scalar problems by the ordering
of the real line. For n-dimensional problems the lack of as simple
an ordering in n-space complicates matters. For some classes of
problems there may be special orderings that facilitate detailed
analysis. In particular ideas such as the endpiece and middlepiece
(in scalar problems) appear to have natural extensions in nonscalar
problems. Some of the difficulties involved with the analysis of
these problems are illustrated in the following example.


Example 10.1: Consider the following control problem:

$$\min_{u_0, \ldots, u_{N-1}} E\left\{ \sum_{k=0}^{N-1} \left( u_k' R(r_k) u_k + x_{k+1}' Q(r_{k+1}) x_{k+1} \right) + x_N' Q_N(r_N) x_N \right\} \qquad (10.1)$$

where $R(j) > 0$, $Q(j) \ge 0$, $Q_N(j) \ge 0$ for $j = 1, 2$ and

$$x_{k+1} = A(r_k)\, x_k + B(r_k)\, u_k \qquad (10.2)$$

$$\mathrm{Prob}\{ r_{k+1} = j \mid r_k = i,\ x_{k+1} = x \} = p(i,j:x) \qquad (10.3)$$

$$r_k \in \{1, 2\}$$

with form transition probabilities

$$p(2,1) = 0$$

$$p(2,2) = 1$$

$$p(1,2:x) = \begin{cases} \lambda_1 & \text{if } x'Sx < \alpha^2 \\ \lambda_2 & \text{if } x'Sx > \alpha^2 \end{cases} \qquad (10.4)$$

where $S = S' > 0$ is an $n \times n$ matrix. In figure 10.1 we show the two
form transition probability pieces in (10.4). Here $p(1,2:x) = \lambda_1$ if
$x$ is inside an n-dimensional ellipsoid centered at zero. For convenience
we will assume that the x process has dimension n=2.

The conditional expected cost-to-go
$\hat V_N(x_N \mid r_{N-1} = 1) = \hat V_N(z_N \mid r_{N-1} = 1)$
is (here $z_N = x_N$ since there is no noise):

$$\hat V_N(x_N \mid r_{N-1} = 1) = \begin{cases} x_N' K_N(1)\, x_N & \text{if } x_N' S x_N < \alpha^2 \\ x_N' K_N(2)\, x_N & \text{if } x_N' S x_N > \alpha^2 \end{cases} \qquad (10.5)$$

where

$$K_N(i) = (1 - \lambda_i)\,(Q(1) + Q_N(1)) + \lambda_i\,(Q(2) + Q_N(2)) \qquad (10.6)$$

for i = 1,2.
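The piecewise cost (10.5) with the weighted matrices (10.6) can be evaluated directly. The sketch below is only illustrative: matrices are plain nested lists, the helper names (`weighted_cost_matrix`, `quad_form`, `conditional_cost`) are hypothetical, and the value returned exactly on the boundary x'Sx = alpha^2 (where (10.5) is not defined) is an arbitrary choice made here.

```python
def weighted_cost_matrix(lam, Q, QN):
    """K_N(i) of (10.6) for transition probability lam = lambda_i.

    Q and QN map the form index (1 or 2) to an n-by-n nested list.
    """
    n = len(Q[1])
    return [[(1.0 - lam) * (Q[1][r][c] + QN[1][r][c])
             + lam * (Q[2][r][c] + QN[2][r][c])
             for c in range(n)] for r in range(n)]


def quad_form(x, M):
    """Evaluate the quadratic form x' M x."""
    n = len(x)
    return sum(x[r] * M[r][c] * x[c] for r in range(n) for c in range(n))


def conditional_cost(x, S, alpha, K1, K2):
    """Piecewise cost (10.5): x'K_N(1)x inside the ellipse, x'K_N(2)x outside.

    Points exactly on the boundary are assigned to the outer piece here;
    that convention is an assumption of this sketch.
    """
    inside = quad_form(x, S) < alpha ** 2
    return quad_form(x, K1 if inside else K2)
```

For example, with Q(1) = I, Q(2) = 2I, zero terminal weights, lambda_1 = .25 and lambda_2 = .75, the two matrices are 1.25 I and 1.75 I, and the cost jumps from 1.25 |x|^2 to 1.75 |x|^2 as x crosses the ellipse.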


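If we assume that

$$K_N(1) < K_N(2) \qquad (10.7)$$

(as in the commensurate goals problem of chapter 7) then
$\hat V_N(x_N \mid r_{N-1} = 1)$ has the general shape that is shown in
figure 10.2. Note that this function is discontinuous for any $x_N$ such
that $x_N' S x_N = \alpha^2$.

We can rewrite (10.1) - (10.6) at time k=N-1 as the comparison of
two constrained subproblems:

$$V_{N-1}(x_{N-1}, r_{N-1} = 1) = \min_{t = 1, 2} V_{N-1}(x_{N-1}, r_{N-1} = 1 \mid t) \qquad (10.8)$$

where

$$V_{N-1}(x_{N-1}, r_{N-1} = 1 \mid t) = \min_{u_{N-1}\ \text{s.t.}\ x_N \in \Delta_N(t)} \left\{ u_{N-1}' R(1)\, u_{N-1} + x_N' K_N(t)\, x_N \right\} \qquad (10.9)$$

where the constraint regions are

$$\Delta_N(1) = \{ x_N : x_N' S x_N < \alpha^2 \}$$

$$\Delta_N(2) = \{ x_N : x_N' S x_N > \alpha^2 \} \qquad (10.10)$$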


If the two conditional cost matrices $K_N(1)$ and $K_N(2)$ are scalar
multiples of each other, then the ellipses described by

$$x_N' K_N(2)\, x_N = \gamma_2$$

$$x_N' K_N(1)\, x_N = \gamma_1$$

in figure 10.2 will be oriented along the same axes in the x-plane. In
this case the optimal controller will have the three-part structure that
is shown in figure 10.3. For $x_{N-1}$ in the ellipsoidal region
$\delta_{N-1}(1)$, the optimal controller places $x_N$ in $\Delta_N(1)$.
For $x_{N-1}$ in the ellipsoidal ring $\delta_{N-1}(2)$, the optimal
controller places $x_N$ somewhere on the ellipse

$$x_N' S x_N = \alpha^2$$

in figure 10.2. Here the controller is hedging to this smooth curve.
For $x_{N-1}$ in the remaining region the optimal controller places $x_N$
in $\Delta_N(2)$. If the conditional cost matrices $K_N(1)$ and $K_N(2)$
are not scalar multiples of each other, then the domains of the optimal
controller at k = N-1 may take any of the shapes in figure 10.4.

Figure 10.2: Conditional expected cost $\hat V_N(x_N \mid r_{N-1} = 1)$ in
example 10.1 when $K_N(1) < K_N(2)$. The upper "bottomless cone" is
$x_N' K_N(2)\, x_N$. The lower cone is $x_N' K_N(1)\, x_N$, which applies
when $x_N' S x_N < \alpha^2$.

At the next time stage, the conditional cost
$\hat{V}_{N-1}(x_{N-1} \mid r_{N-2}=1)$ will have a domain similar to
those in figure 10.4 even if figure 10.2 applies (unless the ellipse
$x'Sx$ is aligned with the x-plane boundary ellipses in figure 10.3).
Thus for all but the most trivial cases, the shapes of the optimal
controller domain (in the x plane) will vary greatly (and be difficult
to describe, in general) as (N-k) increases.


Figure 10.4: Three other $V_{N-1}$ domain shapes in example 10.1 if the
conditional cost boundary ellipses are not aligned. In each case the
region $\delta_{N-1}(2)$ corresponds to hedging to the curve
$x_N' S x_N = \alpha^2$.


Example 10.1 suggests that it is extremely difficult, in general, to
obtain the solutions of n-dimensional versions of the problems of
chapters 7-9. However, for certain subclasses of these problems with
special structures we can obtain greater insight into the nature of the
optimal controller. One such class consists of JLQ (or JLPQ) problems
with scalar x and u and form transition probabilities that are
piecewise-constant in u and x (see footnote 1). As we will discuss in
the next section, such problems are solvable with a relatively minor
modification of the solution algorithm of chapters 7 and 8. The analysis
of other special classes of n-dimensional problems for which the
x-dependent problem is solvable is a topic for future investigation.

We can also think of obtaining approximations of the optimal controller
for n-dimensional problems. One way to do this would be to use the
approach of section 9.9. Basically, this would involve carrying out the
following tasks at each time stage k (and in each form j):

1.  Compute (by numerically integrating over x) the
    $x_{k+1}$-conditional cost surface
    $\hat{V}_{k+1}(x_{k+1} \mid r_k=j)$. This is done by generalizing
    the steps of figures 9.6-9.7 to account for the n-dimensional x.

2.  Compute (numerically) the $z_{k+1}$-conditional cost surface
    $\hat{V}_{k+1}(z_{k+1} \mid r_k=j)$ at (only) a set of target
    $z_{k+1}$ values. As in section 9.9, these target values are chosen
    arbitrarily. Finding an intelligent way to choose them is an open
    question.

3.  For each $x_k$ value of interest, compute the cost incurred if the
    controller drives $z_{k+1}$ to one of the chosen target values.

4.  Select as the optimal control law at each $x_k$ of interest the
    control in step 3 that results in the lowest expected cost-to-go.

Footnote 1: These systems can be thought of as two-dimensional x
problems by augmenting x with the control.
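Steps 3 and 4 above can be sketched for a single stage and a single form. This is only an illustration under stated assumptions: the conditional cost surface from steps 1 and 2 is taken as an already available callable `vhat` (hypothetical), the dynamics are noiseless with z = A x + B u, and the matrices, targets and the function name `approximate_stage` are made up for the example.

```python
import numpy as np

def approximate_stage(x, A, B, R, targets, vhat):
    """Pick the best of a finite set of target values for one state x.

    For each target z, find a control u with A x + B u = z (if one
    exists), charge u'Ru plus the conditional cost vhat(z), and keep the
    cheapest option. Returns (cost, u, z) or None if no target is
    reachable.
    """
    best = None
    for z in targets:
        # Least-squares control moving A x as close as possible to z.
        u, *_ = np.linalg.lstsq(B, z - A @ x, rcond=None)
        if not np.allclose(A @ x + B @ u, z):
            continue  # this target cannot be reached in one step
        cost = float(u @ R @ u) + vhat(z)
        if best is None or cost < best[0]:
            best = (cost, u, z)
    return best

# Hypothetical two-dimensional data for illustration.
A = np.eye(2)
B = np.eye(2)
R = np.eye(2)
targets = [np.array([0.0, 0.0]), np.array([0.5, 0.0]), np.array([1.0, 0.0])]
vhat = lambda z: 2.0 * float(z @ z)   # stand-in for the computed surface

best_cost, best_u, best_z = approximate_stage(np.array([1.0, 0.0]),
                                              A, B, R, targets, vhat)
print(best_cost, best_u, best_z)
```

The quality of the resulting controller depends entirely on the chosen target set, which matches the remark in step 2 that choosing targets intelligently is an open question.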

This approximation scheme can be used (in principle) to obtain an
approximation of the optimal controller for any specific problem of
interest. It does not offer us much insight into the qualitative
properties of the optimal controller, however. The investigation and
analysis of the general nonscalar x problem (or special classes of it)
remains a topic for future investigation.

10.3  Form Transitions that are u-Dependent

From a practical standpoint, control-dependent form transitions are an
important part of the overall fault-tolerant control problem. A large
class of component failures are related to dynamic control input
choices. Indeed, actuator-dependent failures are a major source of
difficulty in many transportation, military, electric power and
communication systems.

We have deferred examining this class of problems because, unless there
is no x-cost in the problem formulation, after one time step the optimal
expected cost-to-go (and conditional expected costs) will have a
piecewise structure in x (as well as in u) if the transition
probabilities are piecewise in u. When both x and u are scalar, however,
this does not present a serious difficulty. Conceptually, such problems
are basically the same as in the scalar-x, x-dependent-only case.


Consider form transition probabilities

$$\Pr\{ r_{k+1}=j \mid r_k=i,\; x_{k+1}=x,\; u_k=u \} = p(i,j;x,u) \qquad (10.11)$$

that are piecewise-constant in x and u. At each time step the optimal
expected cost-to-go $V_k(x_k, r_k=j)$ is given by

$$V_k(x_k, r_k=j) = \min_{u_k} \Bigl\{ u_k^2 R(j) + \hat{V}_{k+1}(x_{k+1}, u_k \mid r_k=j) \Bigr\} \qquad (10.12)$$


where the conditional expected cost-to-go is parameterized by both
$x_{k+1}$ and $u_k$. The minimization in (10.12) is solved by breaking
up the problem into a collection of subproblems, each corresponding to
keeping $x_{k+1}$ in a certain interval and simultaneously keeping $u_k$
in a certain interval. The solutions of these subproblems are
piecewise-quadratic in $x_k$, with unconstrained pieces (resulting from
$u_k$ in the interior of the u-constraint interval in question and the
resulting $x_{k+1}$ also in the interior of its constraint interval) and
actively constrained pieces (where either $u_k$ or $x_{k+1}$ (or both)
is at a constraint boundary point). The effect of the constraints in
$u_k$ is to complicate the book-keeping regarding the intervals of $x_k$
values where each subproblem solution cost function piece is valid.
Nevertheless, we obtain the optimal cost $V_k(x_k, r_k=j)$ and control
$u_k(x_k, r_k=j)$ by comparing a collection of piecewise-quadratic (in
$x_k$) cost functions, as in chapters 5-9. We consider the following
simple example that illustrates the effect of having u-dependent form
transitions:


Example 10.2:
Let

$$x_{k+1} = x_k + u_k \quad \text{if } r_k = 1$$

$$x_{k+1} = 2x_k + u_k \quad \text{if } r_k = 2$$

$$p(1,2;u) = \begin{cases} 1/4 & |u| \le 1 \\ 3/4 & |u| > 1 \end{cases}$$

$$p(1,1;u) = 1 - p(1,2;u)$$

$$p(2,2) = 1 \qquad p(2,1) = 0$$

where we minimize

$$E\Bigl\{ \sum_{k=k_0}^{N-1} \bigl( u_k^2 + x_{k+1}^2 \bigr) + x_N^2 K_T(r_N) \Bigr\} \qquad (10.13)$$

with terminal conditions

$$K_T(1) = 0 \qquad K_T(2) = 3$$

This is similar to example 5.1 (which was modified in chapters 6, 7, 8
and 9), except that the form transition probability is
piecewise-constant in u.


In form $r_k = 2$ we have

$$V_k(x_k, r_k=2) = x_k^2 K_k(2) \qquad (10.14)$$

where

$$K_N(2) = 3, \qquad K_k(2) = \frac{4\,[K_{k+1}(2) + 1]}{2 + K_{k+1}(2)} \qquad (10.15)$$

as in earlier versions of this example since, once the system enters
form r = 2, it cannot leave.
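The form-2 recursion (10.15) is easy to iterate numerically. The sketch below is an illustration only; the function name `form2_gains` is hypothetical, and the closed-form limit 1 + sqrt(5) is not stated in the text but follows from solving K = 4(K+1)/(2+K), that is K^2 - 2K - 4 = 0, for its positive root.

```python
import math

def form2_gains(K_N, steps):
    """Iterate K_k(2) = 4 (K_{k+1}(2) + 1) / (2 + K_{k+1}(2)) backwards.

    Returns [K_N, K_{N-1}, ..., K_{N-steps}].
    """
    gains = [K_N]
    for _ in range(steps):
        K_next = gains[-1]
        gains.append(4.0 * (K_next + 1.0) / (2.0 + K_next))
    return gains

gains = form2_gains(3.0, 20)
fixed_point = 1.0 + math.sqrt(5.0)   # positive root of K^2 - 2K - 4 = 0

for stages_back, K in enumerate(gains[:4]):
    print(f"K_(N-{stages_back})(2) = {K:.6f}")
print(f"limit candidate: {fixed_point:.6f}")
```

Starting from the terminal value 3, the sequence settles within a few stages, which suggests that in this example the form-2 cost coefficient is nearly stationary away from the terminal time.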


We  also  have  that 
(10.16)


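$$V_{N-1}(x_{N-1}, r_{N-1}=1 \mid |u_{N-1}| \le 1) = \min_{|u_{N-1}| \le 1} \Bigl\{ u_{N-1}^2 + (7/4)\, x_N^2 \Bigr\} \qquad (10.20)$$

For $|u_{N-1}| > 1$ we have

$$V_{N-1}^{2} = \min_{u_{N-1}} \Bigl\{ u_{N-1}^2 + (13/4)\bigl[ x_{N-1}^2 + 2 x_{N-1} u_{N-1} + u_{N-1}^2 \bigr] \Bigr\} \qquad (10.21)$$

Differentiating with respect to $u_{N-1}$ and setting to zero we find
that the unconstrained solution to (10.21) is

$$u_{N-1}^{2,U} = -(13/17)\, x_{N-1} \qquad (10.22)$$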
with the corresponding cost

$$V_{N-1}^{2,U} = (13/17)\, x_{N-1}^2 \qquad (10.23)$$

This is valid if

$$|x_{N-1}| > (17/13) \qquad (10.24)$$

(i.e., in this case we have $|u_{N-1}| > 1$). If

$$|x_{N-1}| \le (17/13) \qquad (10.25)$$

we can drive the system to the constraint boundary

$$u_{N-1}^{2,+} = 1^+ \qquad (10.26)$$

obtaining

$$x_N^{2,+} = x_{N-1} + 1 \qquad (10.27)$$

with the cost

$$V_{N-1}^{2,+} = (13/4)\, x_{N-1}^2 + 13\, x_{N-1} + (15/2) \qquad (10.28)$$


Alternatively we can drive the system to the constraint boundary

$$u_{N-1}^{2,-} = -1^- \qquad (10.29)$$

obtaining

$$x_N^{2,-} = x_{N-1} - 1 \qquad (10.30)$$

with the cost

$$V_{N-1}^{2,-} = (13/4)\, x_{N-1}^2 - 13\, x_{N-1} + (15/2) \qquad (10.31)$$


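Similarly if $|u_{N-1}| \le 1$ we have

$$u_{N-1}^{1,U} = -(7/11)\, x_{N-1} \qquad (10.32)$$

$$V_{N-1}^{1,U} = (7/11)\, x_{N-1}^2 \qquad (10.33)$$

which is valid for

$$|x_{N-1}| < (11/7). \qquad (10.34)$$

If

$$x_{N-1} \ge (11/7) \qquad (10.35)$$

we can drive to the boundary

$$u_{N-1}^{1,+} = 1^+ \qquad (10.36)$$

obtaining

$$x_N^{1,+} = x_{N-1} + 1^+ \qquad (10.37)$$

with the cost

$$V_{N-1}^{1,+} = (7/4)\, x_{N-1}^2 + 7\, x_{N-1} + (9/2) \qquad (10.38)$$

If

$$x_{N-1} \le -(11/7) \qquad (10.39)$$

we can drive to the boundary

$$u_{N-1}^{1,-} = -1^- \qquad (10.40)$$

obtaining

$$x_N^{1,-} = x_{N-1} - 1 \qquad (10.41)$$

with

$$V_{N-1}^{1,-} = (7/4)\, x_{N-1}^2 - 7\, x_{N-1} + (9/2) \qquad (10.42)$$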

Note that all six of the above candidate cost-to-go pieces are
quadratic in $x_{N-1}$, and are defined over regions of validity in
terms of $x_{N-1}$ (and not $u_{N-1}$). Thus once these candidate costs
and their regions of validity are established we can find the optimal
controller by finding the intersections of quadratic functions, as in
chapters 5-8. In figure 10.5 we show the regions of validity for each of
the candidate cost functions in this example.
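The two unconstrained pieces can be checked with exact arithmetic. The sketch below assumes the form-1 dynamics x_N = x_{N-1} + u_{N-1} and a stage problem of the form "minimize u^2 + c (x + u)^2", with c = 7/4 for |u| <= 1 and c = 13/4 for |u| > 1 as in (10.20) and (10.21). The helper name `unconstrained_piece` is hypothetical.

```python
from fractions import Fraction

def unconstrained_piece(c):
    """Minimize u**2 + c*(x + u)**2 over u for a fixed scalar x.

    Setting the derivative 2u + 2c(x + u) to zero gives u = -c/(1+c) x,
    and substituting back gives the cost c/(1+c) x**2. The control
    magnitude reaches 1 when |x| = (1+c)/c.

    Returns (control gain, cost coefficient, |x| at which |u| = 1).
    """
    gain = -c / (1 + c)
    cost = c / (1 + c)
    threshold = (1 + c) / c
    return gain, cost, threshold

low = unconstrained_piece(Fraction(7, 4))     # |u| <= 1 subproblem
high = unconstrained_piece(Fraction(13, 4))   # |u| > 1 subproblem

print("low-probability piece :", low)
print("high-probability piece:", high)
```

The returned thresholds are the boundaries of the regions of validity in x_{N-1}: the first piece is feasible for |x| below its threshold, the second for |x| above its threshold.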


Since (13/17) > (7/11) we have

$$V_{N-1}^{2,U} > V_{N-1}^{1,U} \quad \text{for all } x_{N-1} \ne 0.$$

Thus $V_{N-1}^{1,U}$ is optimal over the interval (-11/7, 11/7). For
$|x_{N-1}|$ sufficiently large, $V_{N-1}^{2,U}$ will be optimal. We find
that $V_{N-1}^{2,U}$ and $V_{N-1}^{1,-}$ intersect in
$(-\infty, -11/7)$ at

$$x_{N-1} = -6.3897.$$

Consequently the optimal controller at time k = N-1 for this example is

$$V_{N-1}(x_{N-1}, r_{N-1}=1) = \begin{cases} V_{N-1}^{2,U} & \text{for } x_{N-1} < -6.3897 \\ V_{N-1}^{1,-} & \text{for } -6.3897 \le x_{N-1} < -11/7 \\ V_{N-1}^{1,U} & \text{for } -11/7 \le x_{N-1} < 11/7 \quad (-11/7 = -1.571) \\ V_{N-1}^{1,+} & \text{for } 11/7 \le x_{N-1} \le 6.3897 \\ V_{N-1}^{2,U} & \text{for } x_{N-1} > 6.3897 \end{cases} \qquad (10.43)$$
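The switching point 6.3897 can be reproduced by intersecting two of the candidate quadratics. The sketch below takes the coefficients as they are quoted in the text: the unconstrained piece (13/17) x^2 and a boundary piece with coefficients 7/4, 7 and 9/2. The sign of the linear term used here is the one that places the crossing on the negative axis; that pairing is an assumption of this sketch.

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a x**2 + b x + c = 0, sorted ascending."""
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)])

# (7/4) x^2 + 7 x + 9/2  minus  (13/17) x^2  equals zero at the crossing.
a = 7.0 / 4.0 - 13.0 / 17.0
roots = quadratic_roots(a, 7.0, 4.5)

# Keep only crossings in the region x < -11/7 where the boundary piece
# is a valid candidate.
crossing = [r for r in roots if r < -11.0 / 7.0]
print(crossing)
```

By the symmetry of the example, the mirror-image crossing on the positive axis has the same magnitude.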

The optimal expected cost is thus piecewise-quadratic in $x_{N-1}$. At
the next time stage we will have a conditional cost that is dependent on
$x_{N-1}$ and $u_{N-2}$; the resulting optimal cost
$V_{N-2}(x_{N-2}, r_{N-2}=1)$ will be piecewise-quadratic in $x_{N-2}$.
Thus at time k = N-2 (and thereafter) the optimal controller must take
into account both u constraints and x constraints.

In this section we have not investigated the qualitative properties of
the control-dependent JLQ problem in detail. However, it is reasonable
to assume that results similar to those of chapters 5-6 and 8 are
accessible. One important area for future research is an examination of
how controllers involving piecewise-constant u-dependent form
transitions differ from x-dependent problems. One difference is
immediately apparent from the above example: the optimal controller here
hedges to values of u, as well as to values of x. Thus for the scalar
case, the optimal controller can hedge to either points or lines in the
two-dimensional u-and-x space.

10.4  JLPC Problems with Form Controls

In this section we consider JLQ, JLPQ and JLPC problems where the form
process can be controlled. The optimal controller chooses, at each time
k, between a finite number of control options. These options either
entail changing form transition probabilities ("indirect form control"
in the terminology of chapter 1) or deterministically switching between
forms ("direct form control"). The form control decision is made using
observations of $x_k$ and $r_k$. It is assumed that the form would be
x-independent if no such controls were applied.

Through the use of form controls, we can endow the JLQ controller with
the active hedging behavior that is one attribute of fault-tolerant
control systems.

Let us consider the following problem formulation (as in (1.1)-(1.10)):

$$x_{k+1} = A(r_k)\, x_k + B(r_k)\, u_k + H(r_k)\, v_k \qquad (10.43)$$

$$\Pr\{ r_{k+1} = j \mid r_k = i,\; q_k = q \} = p(i,j;q) \qquad (10.44)$$


A(rk'rk+l)xk+l  +  Z(rk'rk+1} 


(10.45) 


$$q_k \in \underline{L} = \{1, 2, \ldots, L\}, \qquad L < \infty$$

where $\underline{L}$ is a set of form control options. We assume that
$(x_k, r_k)$ is perfectly observed at each time k, and we seek to
minimize


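$$\min_{u_k,\, q_k} J_{k_0}(x_{k_0}, r_{k_0}) = E\Bigl\{ \sum_{k=k_0}^{N-1} \bigl[ u_k' R(r_k)\, u_k + x_{k+1}' Q(r_k, r_{k+1}; q_k)\, x_{k+1} + S(r_k, r_{k+1}; q_k)\, x_{k+1} + P(r_k, r_{k+1}; q_k) \bigr]$$
$$\qquad + x_N' K_T(r_{N-1}, r_N; q_{N-1})\, x_N + H_T(r_{N-1}, r_N; q_{N-1})\, x_N + G_T(r_{N-1}, r_N; q_{N-1}) \Bigr\} \qquad (10.46)$$

where

$$R(j) > 0, \qquad (10.47)$$

$$\begin{bmatrix} Q(i,j;q) & S'(i,j;q)/2 \\ S(i,j;q)/2 & P(i,j;q) \end{bmatrix} \ge 0 \qquad (10.48)$$

$$\begin{bmatrix} K_T(i,j;q) & H_T'(i,j;q)/2 \\ H_T(i,j;q)/2 & G_T(i,j;q) \end{bmatrix} \ge 0 \qquad (10.49)$$

for all $i, j \in \underline{M}$ and $q \in \underline{L}$. We will
optimize over all feedback control laws of the types

$$u_k = \phi_k(x_{k_0}, \ldots, x_k;\; r_{k_0}, \ldots, r_k;\; u_{k_0}, \ldots, u_{k-1};\; q_{k_0}, \ldots, q_{k-1}) \qquad (10.50)$$

$$q_k = g_k(x_{k_0}, \ldots, x_k;\; r_{k_0}, \ldots, r_k;\; u_{k_0}, \ldots, u_{k-1};\; q_{k_0}, \ldots, q_{k-1}). \qquad (10.51)$$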


Thus this problem is a modification of the problems addressed in
chapter 4. Note that we allow for different costs Q(·), S(·), P(·),
K_T(·), H_T(·), and G_T(·) depending upon the form control. This lets us
model costs of maintenance, switching to backup systems and the like.
Also note that in (10.44) the form transition probability p(i,j;q) takes
a different value for each form control option q.

We could obtain the optimal controller for any pre-specified sequence of
form controls $\{q_k\}_{k=k_0}^{N-1}$ using Proposition 4.2. These
quadratic-in-$x_k$ solutions (not piecewise-quadratic) would then have
to be compared to determine the best sequence of options. This method of
finding the optimal controls is clearly unsatisfactory since the number
of form control options to be evaluated grows geometrically as (N-k)
increases, i.e., as $L^{N-k}$.

An alternative approach is to apply dynamic programming. At each time
step we find the optimal cost by considering the intersections of the L
optimal costs-to-go from $(x_k, r_k=j)$ that correspond to each choice
of form control.

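Applying dynamic programming, we have

$$V_N(x_N, r_{N-1}, r_N, q_{N-1}) = x_N' K_T(r_{N-1}, r_N; q_{N-1})\, x_N + H_T(r_{N-1}, r_N; q_{N-1})\, x_N + G_T(r_{N-1}, r_N; q_{N-1}) \qquad (10.52)$$

$$V_{N-1}(x_{N-1}, r_{N-1}=j) = \min_{u_{N-1},\, q_{N-1}} \Bigl\{ u_{N-1}' R(j)\, u_{N-1} + E\bigl[ x_N' Q(j, r_N; q_{N-1})\, x_N + S(j, r_N; q_{N-1})\, x_N + P(j, r_N; q_{N-1}) + V_N(x_N, j, r_N, q_{N-1}) \bigr] \Bigr\} \qquad (10.53)$$

and for k = N-2, N-3, ...,

$$V_k(x_k, r_k=j) = \min_{u_k,\, q_k} \Bigl\{ u_k' R(j)\, u_k + E\bigl[ x_{k+1}' Q(j, r_{k+1}; q_k)\, x_{k+1} + S(j, r_{k+1}; q_k)\, x_{k+1} + P(j, r_{k+1}; q_k) + V_{k+1}(x_{k+1}, r_{k+1}) \bigr] \Bigr\}$$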


If $x_k$ is scalar and there is no input noise, we can use the algorithm
of chapters 5-7 to find the L piecewise-quadratic (in $x_k$) costs
associated with the different form control options at time k. These cost
functions are then evaluated and compared at each $x_k$ (by finding
their intersections), and the best choice of q is selected for each
$x_k$ value. That is, the control $q_k$ depends explicitly on the
observed value of $x_k$ as well as $r_k$. The resulting optimal expected
cost-to-go is thus also piecewise-quadratic in $x_k$. The use of dynamic
programming lets us "prune" the tree of form control options; at each
time we must compare at most L of them.
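The comparison of the L option costs can be sketched on a grid of states. This is a simplified illustration, not the algorithm of chapters 5-7: each option's cost-to-go is taken to be a single quadratic a x^2 + b x + c rather than a piecewise-quadratic function, and the coefficients, option labels and function name `best_form_control` are hypothetical.

```python
import numpy as np

def best_form_control(x_grid, option_costs):
    """Pointwise comparison of the costs of the form control options.

    option_costs maps each option q to coefficients (a, b, c) of the
    cost a*x**2 + b*x + c. Returns the best option and the resulting
    cost at every grid point. Ties go to the smallest option label.
    """
    qs = sorted(option_costs)
    values = np.array([np.polyval(option_costs[q], x_grid) for q in qs])
    best_idx = values.argmin(axis=0)
    return np.array(qs)[best_idx], values.min(axis=0)

# Option 1: no intervention. Option 2: pay a fixed cost of 2 (for
# example, switching to a backup) in return for a flatter quadratic.
option_costs = {1: (1.0, 0.0, 0.0), 2: (0.5, 0.0, 2.0)}
x_grid = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])

q_star, v_star = best_form_control(x_grid, option_costs)
print(q_star, v_star)
```

With these numbers the two costs cross at |x| = 2, so the selected option depends on the observed state, and the minimum of the two quadratics is itself piecewise-quadratic in x.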

This approach to form control problems can also be used for scalar
problems subject to additive input noise, using the JLPC algorithm of
chapter 9.
In part III of this thesis we developed a basic approach to the solution
of jump linear feedback control problems that consists of the following
tactic:

.  at each time stage k and from each form $r_k = j$, the control
   problem is broken up into a set of constrained subproblems that are
   relatively easy to solve

.  the solutions of these constrained subproblems are then compared to
   determine the optimal control law and expected cost-to-go.

In part IV of this thesis we have used this solution approach to
determine the optimal feedback controllers for several more general
classes of jump linear control problems. The results of chapters 8-10
allow the application of the methodology of chapters 5-7 to more
realistic control problems.

In chapter 8 we considered a modification of the solution algorithm of
chapter 7 that lets us solve problems involving:

.  x operating costs and terminal costs that are piecewise-quadratic in
   x (rather than just quadratic)

.  cost pieces that are concave-up as well as concave-down.

This jump linear piecewise quadratic (JLPQ) control problem is solved
using an off-line, recursive algorithm. As in the JLQ problem of part
III, the optimal JLPQ control laws are piecewise-linear in x and the
optimal expected costs-to-go are piecewise-quadratic. Unlike the JLQ
case, the number of optimal controller pieces may grow at a
faster-than-linear rate as the number of stages from the finite terminal
time increases. The piecewise structure of the optimal controller is
caused by both the piecewise-constant nature of the form transition
probabilities (as in part III) and by the piecewise-quadratic nature of
the x-operating and terminal costs.

In chapter 9 we extended the solution methodology of chapters 5-8 to
address a larger class of scalar jump linear control problems,
possessing additive input noise and a more general class of x-dependent
form transition probabilities, x-operating costs and x-terminal costs.
Specifically we considered scalar jump linear control problems with
quadratic control penalties and

.  input noise densities that are twice continuously differentiable
   except at a finite number of points,

.  x-operating costs Q(x,r), x-terminal costs Q_T(x,r) and form
   transition probabilities p(i,j;x) that consist of a finite number of
   convex or concave (in x) pieces.

We call this the jump linear piecewise convex (JLPC) control problem.
The major extension in chapter 9 is the inclusion of additive input
noise in the x-dynamics. Additive input noise profoundly changes the
nature of the optimal controller. The piecewise-quadratic structure of
the optimal cost and piecewise-linear structure of the optimal control
laws is lost due to the "blurring" effects of the noise. In chapter 9 we
show how JLPC control problems with additive input noise can be
reformulated (at each time stage) as problems that do not possess input
noise. This is done by breaking the noisy JLPC problem into a comparison
of subproblems that are constrained in the value of the artificial
variable $z_{k+1}$:


$$z_{k+1} = a(j)\, x_k + b(j)\, u_k$$

(that is, $x_{k+1}$ with the additive input-noise term removed), which
is determined by control $u_k$ when $x_k$ and $r_k = j$ are given.

These reformulated problems can be solved using the approach of
chapters 5-8.

The optimal JLPC controller can be obtained following the steps of an
algorithm which is a generalization of those developed in chapters 7
and 8. However, many of the algorithm steps are quite difficult or
impossible to carry out analytically. Consequently, numerical methods
must be used, as was illustrated in chapter 9.

This requirement of numerical approximations, and the fact that the
optimal JLPC controller does not have the nice inductive
piecewise-quadratic cost structure (at each time stage), motivated
consideration of approximations of the optimal controller that are
easier to determine and to implement than the optimal controller. We
examined one approximation scheme that yields controllers that have
piecewise-linear control laws at each time.

In chapter 10 we examined further extensions of the solution methodology
of Part III. We first considered jump linear control problems where the
x process is not scalar. This class of problems is far more complicated
than in the scalar case. We can, however, obtain approximate controllers
for these problems using the ideas of section 9.9. The next topic of
chapter 10 was jump linear quadratic control problems with u-dependent
form transitions. This problem is of practical importance since it
captures the issue of actuator-dependent failures and it allows us to
examine conflicts between system performance and reliability
requirements. We demonstrated (via an example) that these problems are
accessible using the ideas of chapters 5-8 when both x and u are scalar.

In chapter 10 we also considered JLQ problems (and JLPQ problems) where
the form process can be controlled on the basis of observed $x_k$ and
$r_k$ values. This allows us to study controllers that use strategies
such as preventive maintenance, switching to backup systems in
anticipation of failures and the like. For such problems with scalar x
and x-dependent form transition probabilities (a priori), after one time
stage (solving backwards from a finite terminal time) the optimal
control problem resembles the x-dependent JLPQ problem of chapter 8. The
optimal expected costs-to-go are piecewise-quadratic in x and are
indexed by the choice of form control $q_k$ as well as the current form
$r_k$, at each time k.

In conclusion, in part IV we have extended the results of chapters 5-7
to more general jump linear control problems that involve more
complicated system and cost descriptions. These extensions are motivated
by a desire to make the solution approach of part III applicable to more
realistic control problems. We believe that the results of parts III and
IV comprise an important step in the development of techniques for
fault-tolerant control system design.


PART  V 


CONCLUSIONS 
11.  CONCLUSIONS AND SUGGESTIONS FOR FUTURE RESEARCH

Using dynamic programming, several classes of the discrete-time jump
linear control problem formulation of chapter 1 have been solved. In
this chapter we will briefly summarize the results obtained and we will
identify a number of directions for future research.

Let us begin by considering the basic assumptions of the control
problems we have studied. This thesis focuses on systems where the form
observations are not noisy. This restriction was not made because the
noisy observation case is unimportant. Rather, even when the form is
perfectly observed, the solution of control problems of this kind for
the x- and u-dependent form transition probability cases is very
difficult, previously unsolved, important, and useful in terms of the
insight which it provides us regarding the tradeoffs between reliability
and system performance goals in fault-tolerant controller designs.

An important task for future research is the study of these problems
when the form process is not perfectly observed. Two cases which merit
investigation are

.  problems where only $x_k$ is observed, in the presence of additive
   noise

.  problems where a noisy version of $r_k$ is observed.

One recommended strategy for this analysis is to consider suitably
modified versions of the two archetypical single form-transition
problems (i.e., commensurate and conflicting performance and reliability
goals) of chapter 7.


In this work we have restricted our attention to the fault-tolerant
control of discrete-time systems. As described in chapter 1, there are
several practical reasons for doing this. In addition, the discrete-time
formulations of these problems are much more easily analyzed than
continuous-time ones. When dynamic programming is used to solve
discrete-time trajectory control problems there is no partial
differential equation that must be solved. Thus we need not grapple with
the unsolved nonlinear partial differential equations that arise from
continuous-time versions of the control problems of parts III and IV.
The use of discrete-time problem formulations in this work has enabled
us to gain considerable conceptual insight into the structure of
fault-tolerant control systems. The continuous-time version of the
Markovian form JLQ problem (of part II) was first formulated and solved
by Krasovskii and Lidskii [34], and later by Wonham [76] and Sworder
[63].

The study of continuous-time versions of the problems of parts III and
IV is a challenging topic for future research. The solution of a class
of nontrivial continuous-time jump linear control problems with
non-Markovian, x-dependent form transitions would be a significant
contribution.

In part II of this thesis we considered JLQ control problems for
n-dimensional systems with Markovian form transitions. The noiseless
case was addressed in chapter 3, and in chapter 4 this problem
formulation was extended to include jump costs, affine resets of x and
additive white input and x-observation noises (but with the form process
still perfectly observed). The optimal control laws that are obtained
are linear in x, with a different law for each form. The expected
costs-to-go are quadratic in x (for each form). All of the control gains
and costs are obtained by solving off-line a set of precomputable
Riccati-like difference equations (one for each form). Necessary and
sufficient conditions are derived for the existence of a set of
steady-state constant expected cost-to-go functions. It is shown that
the corresponding set of time-invariant steady-state control laws
stabilizes the controlled system, in that $E\{x_k' x_k\} \to 0$ as
$(k - k_0) \to \infty$, and that the steady-state control laws minimize
the limiting expected cost-to-go as $(N - k_0) \to \infty$, with finite
optimal expected cost.

The presence of additive (usually Gaussian) white observation and input
noise does not complicate these problems. Since the form is perfectly
observed (with delay), a separation theorem like that of the standard
LQG problem follows. In each form, a Kalman filter estimates x, and this
estimate is then used by the control law for that form.

The results of chapters 3 and 4 suggest several directions for future
research:

1.  Proposition 3.1 specifies a set of coupled recursive Riccati-like
    difference equations whose solution specifies the optimal JLQ
    controller. An efficient technique for solving these coupled
    equations is needed.

2.  Proposition 3.2 provides necessary and sufficient conditions for
    existence of the optimal steady-state JLQ controller. These
    conditions are not easily tested for nonscalar-x problems, however,
    since they require the simultaneous solution of coupled matrix
    equations containing infinite sums. In Corollaries 3.4 and 3.5
    sufficient conditions that are based upon singular values are
    presented that are somewhat more testable for some problems.
    However, the derivation of easily calculable conditions for the JLQ
    steady-state problem (like the controllability and observability
    conditions of the LQ problem) remains an open question.

3.  A more restrictive sufficient condition for the existence of
    steady-state optimal controllers for the continuous-time version of
    the problem was developed by Wonham [76]. The attainment of
    necessary conditions for the continuous-time problem remains an open
    question.
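To make item 1 concrete, one backward step of a coupled Riccati-like recursion can be sketched as follows. This is not a transcription of Proposition 3.1. It is derived here under stated assumptions: noiseless dynamics x_{k+1} = A(r_k) x_k + B(r_k) u_k, x-independent Markov transition matrix P, and cost terms u_k' R(r_k) u_k + x_{k+1}' Q(r_{k+1}) x_{k+1} with a quadratic terminal cost. The function name `jlq_backward_step` is hypothetical.

```python
import numpy as np

def jlq_backward_step(K_next, A, B, Q, R, P):
    """One backward step of the coupled recursions, one matrix per form.

    K_next[i] is the cost matrix of form i at the next stage. For each
    form j the conditional cost matrix is the P-weighted average of
    Q[i] + K_next[i]; the gain and the new cost matrix then follow from
    an ordinary LQ step against that averaged matrix.
    """
    M = len(A)
    K, gains = [], []
    for j in range(M):
        Qhat = sum(P[j, i] * (Q[i] + K_next[i]) for i in range(M))
        W = R[j] + B[j].T @ Qhat @ B[j]
        gain = -np.linalg.solve(W, B[j].T @ Qhat @ A[j])
        Acl = A[j] + B[j] @ gain
        K.append(gain.T @ R[j] @ gain + Acl.T @ Qhat @ Acl)
        gains.append(gain)
    return K, gains

# Scalar two-form data patterned on example 10.2, with the form-1
# failure probability frozen at 1/4 (an assumption of this sketch).
A = [np.array([[1.0]]), np.array([[2.0]])]
B = [np.array([[1.0]]), np.array([[1.0]])]
Q = [np.array([[1.0]]), np.array([[1.0]])]
R = [np.array([[1.0]]), np.array([[1.0]])]
P = np.array([[0.75, 0.25], [0.0, 1.0]])
K_T = [np.array([[0.0]]), np.array([[3.0]])]

K, gains = jlq_backward_step(K_T, A, B, Q, R, P)
print(K[0].item(), K[1].item(), gains[0].item())
```

The coupling between forms enters only through the averaged matrix, so each step costs one linear solve per form; whether this is efficient enough for large numbers of forms is the open issue raised in item 1.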

In part III (chapters 5, 6 and 7) we have considered scalar JLQ control
problems that involve state-dependent structural changes. This class of
nonlinear stochastic control problems yields controller designs which
endow systems with fault-tolerance, in that the controller takes into
account known system limitations and failure likelihoods so as to
achieve the best tradeoff between system reliability and performance
goals. The optimal controller attempts to minimize the cost incurred by
the usual LQ regulator action, and by driving the system state to
regions where the likelihood of undesirable form shifts is reduced.

We have formulated and solved a class of scalar-in-x, noiseless JLQ
problems with x-dynamics that would be linear, if not for random
x-dependent jumping parameters. These problems possess form transition
probabilities that depend upon x in a piecewise-constant way. For this
class of problems we have developed a procedure that calculates the
optimal expected costs-to-go and control laws "off-line", in advance of
system operation. The procedure determines the optimal controller
inductively, backwards in time (for finite time-horizon problems). At
each time the optimal controller is obtained by calculating and
comparing a growing number of quadratic functions. These quadratic
functions are computed via Riccati-like difference equations. We
established that the optimal control laws are piecewise-linear in x
(with $x^1, x^0$ terms) and the optimal expected costs-to-go are
piecewise-quadratic in x (with $x^2, x^1, x^0$ terms). The different
controller pieces arise from using the control to actively hedge. We
also identified and examined several basic qualitative properties of the
optimal JLQ controller. These included hedging-to-a-point, regions of
avoidance and the endpieces and middlepieces of the expected costs-to-go
and control laws. In chapter 7 we used the combinatoric properties
established in chapter 5 and the results of chapter 6 to construct an
algorithm for the efficient computation of the optimal controller. This
algorithm was presented in flowchart form and described in detail. A
very useful topic for future efforts is the development of mechanized
schemes for implementing the flowchart steps for general JLQ problems.
This will probably require the use of a high-level symbolic-manipulation
computer language.


The class of JLQ control problems addressed by chapter 5
is extremely rich. The resulting optimal controllers can exhibit
a wide variety of qualitative behaviors. Analytical
characterizations of these JLQ controllers that are sufficiently general to
encompass the entire problem class tend to be uninformative, since
so many diverse behaviors must be simultaneously considered. We
chose, therefore, to focus our attention on problems that lend
insight into the kinds of qualitative JLQ controller behaviors that
are appropriate in fault-tolerant control applications. We
considered two archetypical problem classes in detail. In one of these
classes the two goals of high performance and high reliability are
commensurate. In the other class they are at cross purposes. We
examined the parametric dependence of the hedging regions, regions
of avoidance, stability properties, and local minima in the
expected costs-to-go for these controllers. Under certain assumptions
for these problems, the solution algorithm of chapter 7 reduced
to the solution of (increasingly many) sets of difference
equations (as N-k increases). This made these problems amenable to
detailed analysis and it let us illustrate some of the controller
properties and qualitative issues that arise from the use of control
to achieve both reliability and performance goals. There are
probably many other special classes of problems within the general
problem class of chapter 5 for which similar detailed study can be
effected. Of course, they need not correspond to fault-tolerant
control applications. The search for other special problem classes
and their study may be a fruitful line of research.

For the general problem of part III, as the time horizon
of the problem becomes infinite the number of pieces in the
optimal controller becomes infinite. That is, the optimal infinite
time-horizon controller cannot be obtained by any finite algorithm.
For the two problem classes examined in chapter 7 we could analyze
the infinite time behavior of the controller and obtain the optimal
steady-state controllers as (N-k) → ∞, since the optimal controller at
each time can be obtained from the solution of increasingly many
difference equations without making the comparisons and tests in the
solution algorithm that are needed in general. The establishment of
general conditions for the existence of steady-state optimal
controllers for JLQ problems is an open question.

The steady-state solutions that were obtained for the two
problem classes studied in detail here exhibit a structure that
suggests a natural approximation to the steady-state optimal controller
(both for these problems and the general class of problems in chapter
5). These approximations correspond to "finite look-ahead"
controllers which ignore eventualities that might occur beyond some fixed
planning time. By ignoring the far future, optimality is lost in
these controllers but the computational burden of determining them
and the complexity (and cost) of their implementation are reduced.
This finite look-ahead controller was developed in section 7.7.
The evaluation of this controller for general JLQ problems and the
derivation of better suboptimal controllers are open questions for
future research.

In part IV (chapters 8, 9 and 10) we considered a number
of extensions to the basic solution approach of part III, as
described in section 10.5. Among the myriad "next steps" arising
from these chapters we suggest that the following may be particularly
fruitful:

1. For the noisy JLPC problem of chapter 9, consider in
detail the approximation of problems involving
input noise densities that are piecewise-constant
and form transition probabilities that are
piecewise-constant in x. At the first time stage the z-conditional
cost will be piecewise-cubic. If we approximate it by a
piecewise-quadratic function then the JLPQ algorithm of
chapter 8 can be applied for one time step. This will
result in a piecewise-quadratic (in x) expected cost.
Therefore at the next time stage back we will again have
a piecewise-cubic z-conditional cost. Thus there is a
nice recursive structure to this approximation idea.
The key question is how to efficiently approximate the
cubic pieces by quadratics.

2.  For  the  n-dimensional  x  problems  of  section  10.2, 
consider  special  cases  that  look  similar  to  the 
scalar-x-and-u  example  of  section  10.3.  In  particular, 
what  kind  of  hedging  behaviors  will  the  optimal 
controller  demonstrate? 
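
As an illustration of the approximation question raised in item 1, the following sketch replaces a single cubic piece by the quadratic that is closest to it in the least-squares (L2) sense over the piece's interval. This is only one candidate criterion and is not a method developed in this thesis. The function name `best_quadratic_fit` and the coefficient ordering are assumptions made for the example, and the fit is applied piece by piece, so it does not by itself preserve continuity at the piece boundaries.

```python
def best_quadratic_fit(c, left, right):
    """Least-squares quadratic approximation of one cubic piece on [left, right].

    c = (c0, c1, c2, c3) are the coefficients of c0 + c1*x + c2*x**2 + c3*x**3.
    Returns (q0, q1, q2), the coefficients of q0 + q1*x + q2*x**2.
    """
    c0, c1, c2, c3 = c
    m = 0.5 * (left + right)   # interval midpoint
    h = 0.5 * (right - left)   # interval half-width
    # With t = (x - m)/h in [-1, 1], the L2 projection of t**3 onto the
    # quadratics is (3/5)*t (the remainder is a multiple of the Legendre
    # polynomial P3).  The approximation error is therefore
    #     c3 * ((x - m)**3 - 0.6 * h**2 * (x - m)),
    # and subtracting it from the cubic leaves the best quadratic.
    q2 = c2 + 3.0 * c3 * m
    q1 = c1 - 3.0 * c3 * m * m + 0.6 * c3 * h * h
    q0 = c0 + c3 * m ** 3 - 0.6 * c3 * h * h * m
    return q0, q1, q2


# Example: approximate x**3 on the interval [0, 2].
q0, q1, q2 = best_quadratic_fit((0.0, 0.0, 0.0, 1.0), 0.0, 2.0)
```

The error of this fit on a piece of half-width h is proportional to c3*h^3, so narrow pieces and small cubic coefficients are approximated well; whether this is adequate for the recursive scheme of item 1 remains the open question stated there.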


In this thesis we have considered the control of dynamic
systems subject to abrupt structural changes at random times. This
work was motivated by the need for design techniques that yield
fault-tolerant systems. We have concentrated on the tradeoffs and conflicts
between system reliability and performance goals. Specifically,
we considered the attainment of fault-tolerance through control
strategies rather than by direct redundancy. This is, of course, only
part of the overall fault-tolerant design problem. However, the
problem formulations here capture many important issues. We believe
that the problems that are addressed and the results obtained in
this thesis provide an important step in the development of a
general theory of fault-tolerant control.


APPENDIX  TO  PART  I 


Some  Notational  Conventions  Used 

We  list  here  several  notational  conventions  used  in  this  thesis. 

1.  above  a  cost  or  cost  parameter  indicates  that  it  is  conditioned 
on  the  value  of  the  form  process  at  the  previous  time  step 

2.  above  a  boundary  interval  (y  )  indicates  that  this  is  a 
conditional  quantity  parameterized  by  the  z  process  of 
chapter  9. 

3. ^ above a cost or cost parameter indicates that this is a
quantity that is parameterized by the z process and is
conditioned on the previous time step's form process value.

4. A/ above a quantity indicates that it is an approximate
version of the corresponding quantity for the true optimal controller.

5. Cost parameters followed by arguments (t:i) denote the tth
controller piece's parameters, if the system is in form i.

6. superscript t,U indicates the "unconstrained" solution
(control law, cost, etc.) to a constrained subproblem

superscript t,L indicates the constrained left-boundary
solution

superscript t,R indicates the constrained quantity on the
right boundary.


B.  APPENDICES  TO  PART  II 


B.1 Proof of Proposition 3.1

The cost is quadratic for k=N. Given that $V_k(x_k, r_k=j) = x_k' K_k(j) x_k$
for each j at some time k, the optimal $u_{k-1}$ given $(x_{k-1}, r_{k-1})$ can
be obtained by minimizing

    $u_{k-1}' R_{k-1}(r_{k-1}) u_{k-1} + E\{ x_k' Q_k(r_k) x_k + V_k(x_k, r_k) \mid x_{k-1}, r_{k-1} \}$

    $= u_{k-1}' R_{k-1}(r_{k-1}) u_{k-1} + E\{ x_k' [Q_k(r_k) + K_k(r_k)] x_k \mid x_{k-1}, r_{k-1} \}$      (B.1.1)

subject to the dynamic constraint (3.1).

Using the fact that $r_k$ and $x_k$ are conditionally independent
(given $x_{k-1}, r_{k-1}$), and that

    $E\{ Q_k(r_k) + K_k(r_k) \mid r_{k-1} \} = \sum_{j=1}^{M} p_k(r_{k-1}, j) [Q_k(j) + K_k(j)]$,

(B.1.1) can be minimized by substituting for $x_k$ (using (3.1)),
differentiating with respect to $u_{k-1}$ and setting the result to
zero, resulting in (3.7). Condition (3.5) guarantees that the inverse
matrix in (3.7) exists for each $j \in M$. Substitution of (3.7) in (3.6)
yields (3.8). It is easily verified that $K_{k-1}(j)$ is a symmetric positive
semi-definite matrix for each j, that the solution given above always
exists if conditions (3.5) are met (and all parameters are finite) and,
by contradiction, it can easily be shown that this optimal solution is
unique. Recursive application of this argument yields the desired
results.


B.2  Establishing  Condition  (4)  of  Proposition  3.2 


From  (3.17)  we  see  that 


Vi  = 


(A.-B.L.)x,  =  D.x, 
3  ]  ]  X  3  k 


if  yj 


where  and  are  the  steady-state  values  established  by  condition 
(1) — (3)  of  the  Proposition.  Thus  if  rk+1ai' 


Vi'iVi '  \Ki\ 

-  \[D;Kiyv*k 


where  the  last  equality  follows  from  (3.12). 
Hence 


■WVk+l  ■  xOKrxO  -  l  x'i^*LALr.)xi 
0  i»0  i  111 

Now  the  left  side  of  (B.2.1)  is  bounded  below  by  zero. 


.  (B.2.1) 

and  thus 


x!  (Q.+L'R,L.)X.  -*•  0  (B.  2.2) 

*  ]  3  3  3  K 

for  all  nontransient  j  €  M. 

A  contrapositive  argument  now  shows  the  sufficiency  of  condition 
(4) :  Suppose  that  the  steady-state  optimal  controller  yields 


lim  x=x  >  0 

k“V" 

hence 

lira  1 lxJ I  =  I lxi l>  0 

(k-kQ)-x° 

but  that  the  expected  cost  is  finite. 


Since  the  system  will  return  (with  probability  one)  countably 
infinitely  many  times  to  each  form  in  one  of  the  closed  communicating 


subsets  of  M,  an  infinite  cost  must  be  incurred  by  (B.2.2). 

□ 

B.3 Proof of Proposition 3.2 conditions (1)-(3) and
Corollary 3.4

For Proposition 3.2: Let $k_0 = 0$ for simplicity. Note that


Expected 

Cost 


l 

k=0 


WVi  - 


'expected  cost 
while 

r  €  T 


/expected  cost  while  the 
+  I  form  is  in  a  closed 
communicating  subset 


V 


T-l 
k 


;^[lt'k+ia(EfcH)Vl  +  j[*iUiaWVl+uiR(rk’u 


(If T=0 (i.e., $r_0 \notin T$) then the first sum is zero).

Conditions  (1)  and  (2)  of  Proposition  3.2  concern  the  second 
sum  above;  condition  (3)  relates  to  the  first  one.  If  i  is  an 
absorbing  form,  then 


expected  cost  to  go 

00 

E 

from 

f 

“  x1. 

l  (A.-B.F.)  C(Q.+f!r.F. ) (A.-B.F. )fc 

(xk'rk”i)at  k 

k 

111  *1  1  1  X  111 

t=0 

m 

and  thus  (3.18)  follows  immediately. 
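
The series for an absorbing form can be made concrete in the scalar case. The sketch below is an illustration only: it assumes scalar x and u, a fixed feedback law u = -f x in the absorbing form, and a per-step weight (q + f^2 r) on x^2, mirroring the structure of the sum above. The function name and argument names are hypothetical. It shows why the expected cost-to-go is finite exactly when the closed-loop gain (a - b f) has magnitude less than one.

```python
def absorbing_form_cost(a, b, f, q, r, x0, horizon=None):
    """Scalar analogue of sum_t x0^2 (a - b f)^(2t) (q + f^2 r).

    If horizon is given, return the partial sum over t = 0..horizon-1;
    otherwise return the infinite-horizon value (inf if it diverges).
    """
    g = a - b * f            # closed-loop gain under u = -f x
    stage = q + f * f * r    # weight on x^2 at every step
    if horizon is not None:
        return sum(x0 * x0 * g ** (2 * t) * stage for t in range(horizon))
    if x0 == 0 or stage == 0:
        return 0.0
    if abs(g) >= 1.0:
        return float("inf")  # geometric series diverges
    return x0 * x0 * stage / (1.0 - g * g)


stable = absorbing_form_cost(a=1.2, b=1.0, f=0.7, q=1.0, r=2.0, x0=1.0)
unstable = absorbing_form_cost(a=1.2, b=1.0, f=0.0, q=1.0, r=2.0, x0=1.0)
```

In the matrix case the same geometric structure appears with (A_i - B_i F_i) in place of the scalar gain, and the finiteness condition becomes a stability condition on that matrix.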


From  above  it  is  clear  that 


(Q.+F'.R.F.) 

3  3  3  3 

x'  (s^G^x^)  a  x'  (s) 

00  1  £ 

(l-p..)T  pt"1(A.-B.F.) 

33^  33  3  3  3 

+  p 

2^4 

k€T 

(A.-B,F.)t 

3  3  3 

_k/j 

for  any  x' (s J ,  thus  for  each  j  €  T: 


G.  =  (1-p..)  7  pt_1 (A.-B.F.) 

3  31  t“j_  31  3  3  3 


't 


(Q.+f!r.f.) 
3  3  3  3 


l  G 

kST  k  1_Pjj 


[(A.-B.F.) 
3  3  3 


as in (3.20). The above equalities are valid if and only if there exist
$\{G_j : j \in T\}$ that are positive definite and finite valued (each
element) satisfying these coupled equations (the positive-definiteness
must be a property of the $G_j$'s by (3.5)).

Similar  arguments  yields  the  {z^,}  in  (3.19).  n 


For  Proposition  3.4,  note  that  for  absorbing  forms  i 

[Jo  'W/'VWi1  ‘Wi’ i 


<  1 1 ^ 1 1 2  II VFiRiFi II  l  II (Ai-BiFi) 


t.  .2 


Now  suppose  the  system  is  in  a  transient  form  i  e  T. 


Let  si  (i=l,2,.„.)  be  the  times  when  the  system  form  changes,  with 


r(si’  -  ri 


-  0 


Then  the  {(Sj+1~sJ;  are  independent  random  variables. 


Given  that  we  are  in  (x(s.),  r.=j)  at  s., 

11  i 


Ejx' (si+1)x(s.+1) 


x(si)(  = 


V3 


1  i_p3 j *  • 


Let 


x  (sJG^xfsJ 


expected  cost  from  j  6  T  until  T  is 
exited,  given 


x(si>,  s±  <  x,  ri*j 


Clearly. 


x'  (si)GjX(si) 


expected  cost 
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.  )G.  x(s. 
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X  j 

k€T  Pjj  (given  x(s.) 
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+ 
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keT  1-p , .  i+l  k  i+l  1  l  l 


hence (3.28) implies (3.18). Similarly (3.29) implies (3.19) and (3.30)
implies (3.20). For nonre-enterable transient forms, the sum of
terms ($k \in T$, $k \neq j$) in (3.20) is zero; thus (3.31) implies (3.20) trivially.


C. APPENDICES TO PART III


C.1 One-Step Solution Equations (for Proposition 5.1)


For  t  =  1,2, .  . .  ,\jr  let 

iC  i  X 


,31 


be  the  index  of  X^(  )  valid  in  (5.3)  when 


Vi  e  &i‘« 


-31 


be 


the  index  of  the  piece  of  V  (x  ,r  =i) 

i\T  X  1\T  X  Kt  X 


valid  when  X]<+1  e  ^+]_(  -  , 


as  in  (5.27)  -  (5.28) . 

Define  the  conditional  cost  parameters 
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H;+i(t) 
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i=*i 


31 


(C.1.2) 
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where 


-i  1  b2lj)  isLl(t)  b2Iil  M 

eilt,  -  [1  +  —f  i  ■  -  I  +  ■&!<«  (C.1.4, 

R  ( 3 ) 
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2  *  j 
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Note  that 
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ft 

- 
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+ 
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(C/1.6) 


Then  the  candidate  costs-to-go 
laws  in  (5.30),  and  the  optimal 
controls  are: 


in  (5.29)  and  corresponding  optimal  control 
Xk+1  S  A^+i (t)  values  achieved  by  these 


,t,L 


(vj) 


(t-l)  +  GjJ(t-l;t) 


(C.1.7) 


Uk,t,(Xk'j)  =  *  ^  Xk  +  (t-l)  (C .1.8) 

4i  (vj>  *  vi  it'i>  (c'1'91 

for  t  =  2, 3, ... ,  if 

a(j)  \  <  0^(t) 

vk'U(xk'j)  *  \  *k(t>  +  *k  Hk(t)  +  Gk(t)  (C.1.10) 

uj,0(xjt,j)  =  -L-J(t)  ^  +  rj  (t)  (c. l.ii) 

Xk+l(Xk'j)  +  Ca(j)  "  B(j)  Lk(t)3xk  +  b(j)  Fk(t)  (C.1.12) 

for  t  *  1,2, ..  .  if 
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f°r  *  =  1r2 . *k+l  ~  1 


if 


oj(t)  <_a(j)  ^ 


The  quantities  on  the  right-hand  side  of  (C.1.10)  -  (C.1.12)  are  computed 
by 

K^(t)  *  KT(j)  (alljV 

a2 ( j)  R(j)  K^+1(t) 


K^(t) 


R(j)  +  b2(j)  K^+1(t) 
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(C.1.20) 
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=  a(j)/b(j) 
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(C. 1 .24) 


for  t  =  1 , . . . , 


(C.1.25) 


Regarding  the  notation  used  here, 

~  denotes  an  actively-constrained  quantity 
a  denotes  a  conditional  quantity  . 
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It  is  straightforward  to  verify  that  (5.6)  and  (ii)  of  Proposition  5.1 
imply  that 
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for  each  j  8  M  ,  and  thus  (2)  of  Proposition  5.1  holds. 

The  values  of  ( j )  ,  {6^(t)  :t=l, . . .  j)  -l},and 
G^Ct: j) ,  L^(t:j)  and  F^Ctcj)  are  assigned,  for  each  j  6  M,  by 
the minimization indicated in (5.37). The derivation of (C.1.7) - (C.1.25)
is done in the next section.


(C.1.26)


C.2 Derivation of (C.1.7) - (C.1.25)


From  (5.15),  (C.I.i)  —  (C .1.3)  we  have  that 


uj  R(3) 


Vk[*k'rk*^tJ  -  ®i^ 

\ 


\*i  &i(t>  +  sL(t)  vi 


(C.2.1) 


*k+l  6  Ak+l(t) 


Gi+l(t) 


From  (5.1)  we  have  that 
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Thus  (C.2.1)  becomes 
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To  minimize  (C.2.3),  differentiate  with  respect  to  xk+^  and  set  to  zero. 
We  find  that  the  optimal  xk+^  is 


Vl  -  2.(jl  Mi)  ^  -  b2lj) 
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(if  this  xk+^  is,  in  fact,  in  A^+^(t) . 


Also, 
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(C.2.6) 

The  left  and  right  sides  of  (C.2.6)  are  defined  to  be  0?(t)  and  Q^(t), 

K  K 

respectively, as  in  (C.1.4)  -  (C.1.5). 

With the definitions of (C.1.10) - (C.1.12) holding, the substitution
of (C.2.4) into (C.2.3) yields (C.1.16) - (C.1.18) and the substitution
of (C.2.4) into (C.2.2) yields (C.1.19) - (C.1.20).

Now  if 

a(j)xk  <  ej(t) , 

(C.2.5)  implies  that  the  best  we  can  do  is  to  drive  xk+1  to  (t-1), the 

left  boundary  of  A?  ,  (t)  .  Thus  from  (C.2.2) 
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(C.2.7) 


which  yields  (C.1.8)  with  (C.1.21)  -  (C.1.22)  . 

Substituting  x^+^  -y^^Ct-l)  into  (C.2.3)  yields  (C.1.8)  with  (C.1.21)  - 
(C.  1.23)  ;  here  S  =»  t-1  and  t=2, _ ,  ^^+1'  Similarly 

a(j)  ^  >  ej(t) 

case  yields  (C.1.13)  -  (C.1.15);  here  S  =  t  and  t-1,...,  \J?+1  in  (C.1.18) 
If  b(j)  =0  then  the  optimal  control  is 


W  VJ>  ’  0 


Vi  ■  a(j)  *k 


and  cost 


VVV3>  -a2(jl  \  *  a<3)  ^i(t)  -k 

♦  ak.l!t) 

where  the  index  t  is  determined  by  which  region  (t)  the 
value  is  in  (for  each  x.  value) . 


C.3 Proof of Proposition 5.2


To  prove  this  proposition  we  must  first  establish  the  following 

j  a  | 

relationships  between  0  (t)  and  the  slopes  of  V  (x  r  =; 

K  K  Kt.I  JC  *  X  |  JC 

at  the  points  r£+1(t)  :  t  =  1,...,  <P^+1  -  1}. 

Lemma  C.3.1;  The  following  relationships  hold: 


!•  ejj(t)  >  ejd) 


if  and  only  if 


3xk+i|  j  +  sxk+il  j 

xk+l“tYk+l(t_1)  1  *k+l  k+l 

[yj+1(il-l)-  Yk+l(t*1)] 

b  (j) 


-I  + 


2.  Q^(t)  >  eja) 


dOJ 

k+l 


if  and  only  if 

a°k+i 

3xk+i 


xk+l*tYk+lCt) ] 


*k+rlw 


fp-^k+l^  -Yk+l(t)J 

b  (3) 


3.  ej(t)  >  0^(4) 


if  and  only  if 


& 

k+l 


Vi*[Vi(t)1 


[Yk+i(,l'1)  -v.3.,<tn 


(C.3.1) 


(C.3.2; 


(C.3.3) 


Proof of Lemma C.3.1




That  is,  (*k+^ 1 3 )  has  a  discontinuous  increase  at  (t) 

'  k+1 

This  can  happen  only  if  Y^+1 (t)  is  a  form  transition  probability 
discontinuity  v^tf)  (for  i  eC^t  g  v  -l})  . 

Then  clearly 


<  Vk+1'L  <V” 


So  in  this  case  '  (x^, j)  cannot  be  optimal  for 

t  R 

vk'  (x^/j)  may  be.  Similarly,  if 


(G.3.10) 


any  x^ .  However , 


^♦1  'IT 

(which  implies  that  Yk+1<t)  is  a  form  transition  probability  discontinuity) 


Vk+1,L  <V*>  <  v£'R<V3> 


(C.3.11) 


hence  (x^j)  cannot  be  optimal.  Thus  we  need  consider  only  the 

candidate costs-to-go listed in the statement of Proposition 5.2. □




C.4 Proof of Proposition 5.3

1.  Differentiation  of  Vt#(^  }  ,  Vfcj*  }  and  V*'L  (x^j)  and 

j  j,  k,  j 


*?'U<VJ)'  \'R<Vj>  “d 


comparison  with 

u^'Nx^j)  (for  appropriate  values  of  t)  yields  (1)  directly 

2. At joining points $\delta$ where

    $\left. \frac{\partial V_k(x_k, r_k=j)}{\partial x_k} \right|_{x_k=\delta}$

exists, $u_k(x_k, r_k=j)$ and $x_{k+1}(x_k, r_k=j)$ are clearly continuous
from (1).

3.  At  a  joining  point  x  =  6  where  the  slope  of  V  (x  ,r  =j) 

k  jc  k 

decreases 


uk(6+) 


-  V<5> 


-b(j) 

r  3v*k.i> 

• 

2a(j)R(j) 

+ 

m 

<5  * 

<5~ 

* 

but 


3Vk,xk 

^k 

•v31 

< 

xk=6+ 

»VVV3> 

5*k 

!v« 

have 

b(j) 
a  ( j) 

>  0  uk(6+) 

>  V«"> 

b  ( j) 
a(  j) 

<  0  =*  \  ( 6+) 

<  Vs  1 

hence  3(i),  and  from  (5.48) 


Vi(5  '3)  •  W5 '3) 


=  a(j)  [5  -  <$  ]  ~ 


b2(j) 

2a(j)R(j) 


-b2(j) 


2a(  j)R(  j) 


3Vk(fi  ,j) 


3\U+  /j) 


3\(6  /  j) 


3\(6  »j) 


hence  3(ii) 


4. From 3(ii) we have that the mapping

    $x_k \mapsto x_{k+1}(x_k, r_k=j)$

increases discontinuously at joining points where $V_k(x_k, r_k=j)$
is not differentiable, and from (2), the mapping
is continuous at other joining points.

Now between joining points, if the optimal cost corresponds
to hedging to a point then clearly the mapping is constant.
If the optimal cost does not correspond to hedging to a point,
then in such a region

    $V_k(x_k, r_k=j) = V_k^{t,U}(x_k, j)$

for some t (from (C.1.10)). Thus from (5.48)

    $x_{k+1}(x_k, j) = a(j) x_k - \frac{b^2(j)}{2a(j)R(j)} \frac{\partial V_k(x_k, j)}{\partial x_k}$

so that

    $\frac{\partial x_{k+1}}{\partial x_k} = a(j) - \frac{b^2(j)}{2a(j)R(j)} \frac{\partial^2 V_k(x_k, j)}{(\partial x_k)^2}$

    $= a(j) - \frac{b^2(j)}{2a(j)R(j)} \, 2K_k^j(t)$

    $= a(j) - \frac{b^2(j)}{2a(j)R(j)} \cdot \frac{2a^2(j)R(j)\hat{K}_{k+1}^j(t)}{R(j) + b^2(j)\hat{K}_{k+1}^j(t)}$      by (C.1.16)

    $= \frac{a(j)R(j)}{R(j) + b^2(j)\hat{K}_{k+1}^j(t)}$

which is > 0 if a(j) > 0 and < 0 if a(j) < 0.

Thus we have 4(i), (ii).
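
The sign argument for the slope of the one-step mapping can be checked numerically. The sketch below is illustrative only: it assumes scalar quantities and the feedback gain L = a b K / (R + b^2 K) for an unconstrained piece, where K stands for the conditional cost parameter at time k+1. The function name `closed_loop_slope` is hypothetical. It computes the slope both as a - b L and in the reduced form a R / (R + b^2 K).

```python
def closed_loop_slope(a, b, R, K_hat):
    """Slope of x_k -> x_{k+1} on an unconstrained piece, computed two ways.

    a, b   : scalar dynamics coefficients, x_{k+1} = a*x_k + b*u_k
    R      : control weight (assumed > 0)
    K_hat  : conditional quadratic cost parameter at time k+1 (assumed >= 0)
    """
    L = a * b * K_hat / (R + b * b * K_hat)   # feedback gain, u = -L*x + const
    via_gain = a - b * L                      # slope from the control law
    via_formula = a * R / (R + b * b * K_hat) # reduced closed form
    return via_gain, via_formula


g1, g2 = closed_loop_slope(a=2.0, b=1.0, R=1.0, K_hat=3.0)
```

Since R > 0 and the denominator is positive, the slope always carries the sign of a, which is the monotonicity property used in 4(i) and 4(ii).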


Clearly from (5.48) we cannot have regions of avoidance except when
$\partial V_k(x_k, r_k=j) / \partial x_k$ is discontinuous, and from 4(i), 3(ii) we must have
a region of values that are not attainable using the optimal
control associated with each discontinuous joining point.

Fact (5) follows directly from the monotonicity of the mapping

    $x_k \mapsto x_{k+1}(x_k, r_k=j)$

in 4(i), since each candidate cost corresponds to driving $x_{k+1}$ into
a different region of values; if a certain candidate cost is optimal
over two disconnected intervals then the monotonicity of the mapping
is violated. □


C.5 Proof of Proposition 6.1

Conditions (6.1) - (6.4) guarantee that $\hat{K}_{k+1}^j(t)$ and hence
$K_k^j(t)$ are nonzero for all k = N, N-1, ..., 0 (see appendix C.2). As
$|x_k|$ grows large, the candidate costs-to-go $V_k^{t,U}(x_k,j)$, $V_k^{t,L}(x_k,j)$,
$V_k^{t,R}(x_k,j)$ are dominated by their $x_k^2$ terms.
For constrained costs ($V_k^{t,L}(x_k,j)$ or $V_k^{t,R}(x_k,j)$) this term is

    $x_k^2 \tilde{K}^j$

and for unconstrained costs $V_k^{t,U}(x_k,j)$ it is

    $x_k^2 K_k^j(t)$      (all k)

For any $t = 1, \ldots, \psi_{k+1}^j$,

    $\tilde{K}^j > K_k^j(t)$      (C.5.1)

To see this, note from (C.1.16), (C.1.21) that

    $\tilde{K}^j = \frac{a^2(j) R(j)}{b^2(j)} > \frac{a^2(j) R(j) \hat{K}_{k+1}^j(t)}{R(j) + b^2(j) \hat{K}_{k+1}^j(t)} = K_k^j(t)$

Thus for $|x_k|$ large enough, the optimal expected cost-to-go
$V_k(x_k, r_k=j)$ must be one of the unconstrained ones. But for $x_k$ small
(large) enough, $V_k^{1,U}(x_k,j)$ ($V_k^{\psi_{k+1}^j,U}(x_k,j)$) is the only unconstrained
cost that is eligible, and (6.6) - (6.24) follow directly from
Appendix C.1. □


C.6  Proof  of  Proposition  6.3,  part  (iv) 


We  first  verify  (6.48) .  Each  form  j  is  assumed  to  be  stabilizable. 
Thus  by  Proposition  6.2  the  steady-state  endpieces  of  the  optimal  expected 
costs-to-go  in  each  form  are  finite  (for  finite  x) .  The  closed-loop 
optimal  gain  in  the  left  endpiece  of  ( x^ , j )  thus  approaches  the  following 

limiting  value  as  (N-k)  -*■  00  : 


b ( j)  l£e  (j) 


a ( j)  R ( j )  K®  (3) 


by  (6.39) 


by  (6.40) 


a( j)  [1  - 


2  ~Le 
o  ( j)  K^*(j) 

2  ^  1 
R(j)  +  b  (j)  K*~(j) 


by  (6.37) 


a(  j)  [ 


fall.!!  K^d) 

R(j)  K»  (3) 


Now  for  each  j  this  limiting  value  of  the  closed  loop  optimal  gain 
magnitude  must  be  less  than  one  (i.e.  stable)  since  the  steady-state 
endpiece  of  the  cost  function  is  finite.  That  is,  we  have  (6.48) : 

b^  ( i )  A  le 

0  <  a(j)  <  1  +  K«  (j) 

In  particular  there  exists  a  positive  integer  z  <  «°  such  that  for 


each  jeM  we  have 


i  ♦  — ^ -  <  i 

1  *  Inf 


for  all  (N-jj)  >  z. 


(C.6.1)
Now by (iv), there exists a form $\ell$ that is accessible from j
which has an x-dependent form transition probability and can be
repeatedly re-entered. Thus there is some finite number l such that


Pr{rk+p  -  2|rk-  2}  >0 


V  P  >  l 


and,  by  (C.6.1),  the  closed-loop  optimal  control  driving  (x^,rk=p) 

to  (x^+p  'r]c+p=i^  is  greater  than  one  for  any  form  sequence  from 

r,  =  £  to  r.  ^  =  i 
k  k+p 


-  C(ll=K>5*lj) 


hence 


<  sLw 


Similarly, 


«£tak«>-i)  >  «£+p<v0tl>-1 


Hence 


Islpl  <  l4l 


Since  l  can  be  repeatedly  re-entered 


(5*1  -*■  00  as  (N-k)  °° 


consequently  since  JL  is  accessible  from  j/  +  00  as  well.  O 




C.7 Proof of Proposition 6.5


Parts  (1)  and  (2)  of  Proposition  6.5  are  obvious.  Let  us 
consider  part (3),  with  6^  defined,  as  in  the  Proposition,  to  be  the 
smallest  positive  form  transition  probability  discontinuity  location, 
for  all  p(j,i)  where  ie  Cj.By  Proposition  6.4,  for  x^  in  the  right 
middlepiece  of  V^(x^,rk=j)  —  that  is,  for 

0  1  \  1  (C.7.1) 

we  have  the  optimal  control  law  (6.57)  yielding  the  optimal  x^+1  value 


RM 


Vl  =  [a(j)  -  b(j)  Lk  (j)l  ^ 


a(  j) 


j>  t1' 


T 


b2(j>  ^ 

R(j)  +  b2(j)  ^(j)  J  k 


J  ** 


a(  j) 


^  c  ««■ 


1  + 


R(j) 


(C.7.2) 


Now  by  definition,  e  (0,  <5^1  implies  that  xjc+j_  e  10,  1 

for  all  i  e  C  as  well  as  x  .  e  [0,  v..(t))  for  all  i  e  C  and 

J  Ji  J 

t  *  1 , . . . , v.  .-l. 

Thus,  in  particular  we  have 

hence  (6.60)  and  if  jSCj  then  we  have  0  <_  x^+^  <  6  ^  hence  (6.61)  . 


A  symmetric  argument  proves  (4) . 


We  prove  the  proposition  by  induction;  (6.72) -(6.76)  are  trivially 
true  for  k  =  N  by  (6.79).  Now  suppose  (6.72)  -  (6.78)  hold  at  some  time 


k  +  1.  That  is,  for  all  i 


2  to  2  tto 

*k+l  *k+l  (l)  -  Vk+1  (xk+l'rk+l=1)  -  *k+l  *Sc+l  (1) 


Then  for  any  j  e  M 


2 

*k+l 


S  P(j»ii 


ieM 


Vi> 


LB 

ViU) 

+ 

Q(i)  . 


*2 


i6M 


vi B(i) 


Vlt+l(*k+l'rk+li 


-  *k+i  : 

ieM 


2  P( j/i; 


Vi3 


Cl(i) 

+ 

Q(i) 


LB  rm 

From  the  definitions  of  (j)  and  K^^Cj)  in  (6.77)  -  (6.78) 


we  thus  have 


Vi  1 

xeM 


Vi  S(i> 


Vi(Vi'Vi,u 


2  ..OB 


-  Vl  Vl'!’. 


Now  define 


\ 

vrixk,j)  -  ■i,i\,,i),"k(iCiii( 
% 


Recall  that 


( 

4+i  «(i) 

VW3) 

=  min  /u  R(j)  + 

M 

i€M 

+ 

Vk+l(xk+l,rk+l=l). 

Thus 


'fWiW-'f'Vi-11  •  (c-8-11 

Solving  for  each  of  the  costs  in  (C.8.1)  we  directly  obtain  (6.73)  -  (6.7< 


Thus (6.72) - (6.78) hold for k, which completes the proof. □


From  (6.80),  if  we  define  K^(j)  (fo t  each  j6H)  by  the  recursive 


equation 


K^C  j)  -  a2(j)  [2j>jiK]c+l(i)  +  2 (i) ) ) 


R(j)tb2«>  [X  pn  (*&*2(i),l 

L ieC j  J 

where 

V3)  -  vj) 

then  at  each  time  k 


V  (j)  lKk(j) 


(in  all  jeM)  .  Similarly  with  the  p^  definition  of  (6.83)  if  we 
define  K^(j)  by 


^(j)  =»  a2(j)  R(j)  ("J]  p  (Kk+1(i)  +Q(i))| 

ifiCj  J 

R(j)  +  b2(j)  j"  V  Pji  (l^+1(i)  +  Q(i))l 

LieC.  J 

3 

Vi}  "  Vj) 

then  at  each  time  k 


A direct application of Proposition 3.2 to (C.9.1) yields (6.82), and
to (C.9.2) yields (6.85). Hence (6.86) - (6.87). □
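
The bounding recursions used in this proof have the form of scalar, coupled Riccati-like difference equations with state-independent transition probabilities. The following sketch shows one backward step of such a recursion and a sweep over several steps. It is a hedged illustration, not the thesis's algorithm: the function names are hypothetical, and the matrix `P` is assumed to hold whichever constant probabilities (for example the bounding values of (6.80) or (6.83)) are being used, with zero entries for forms outside the cover of j.

```python
def coupled_riccati_step(K_next, a, b, R, Q, P):
    """One backward step K_{k+1}(.) -> K_k(.) for scalar x and u.

    K_k(j) = a_j^2 R_j S_j / (R_j + b_j^2 S_j),
    S_j    = sum_i P[j][i] * (K_{k+1}(i) + Q_i).
    """
    M = len(K_next)
    K = []
    for j in range(M):
        S = sum(P[j][i] * (K_next[i] + Q[i]) for i in range(M))
        K.append(a[j] ** 2 * R[j] * S / (R[j] + b[j] ** 2 * S))
    return K


def backward_sweep(K_T, a, b, R, Q, P, steps):
    """Apply coupled_riccati_step repeatedly, starting from terminal costs K_T."""
    K = list(K_T)
    for _ in range(steps):
        K = coupled_riccati_step(K, a, b, R, Q, P)
    return K


# Two forms; form 2 (index 1) is absorbing.
K = backward_sweep(K_T=[0.0, 0.0], a=[1.0, 1.5], b=[1.0, 1.0],
                   R=[1.0, 1.0], Q=[1.0, 1.0],
                   P=[[0.9, 0.1], [0.0, 1.0]], steps=50)
```

Because each step only needs the previous step's values for every form, the whole sequence can be computed off-line, which is the property exploited when these recursions are used as upper and lower bounds.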


C.10  Proof  of  Proposition  6.11 




Substitution  of 


Vi  s 


2a (1)  R(l) 
b2(l) 


<$N_^{2)  in  £C,10,1)  gives 


(a(l)  5  (2)  +  a)  >  0 

N-J. 


which  yields  (6.161)  directly. 


C.11 Proof of Fact 7.2


The evolution of $K_k(1:2)$ as N-k increases is specified by (6.96):

    $K_k(1:2) = \frac{a^2(2) R(2) [K_{k+1}(1:2) + Q(2)]}{R(2) + b^2(2) [K_{k+1}(1:2) + Q(2)]}$      (C.11.1)

Claim: $K_{k-1}(1:2) > K_k(1:2)$ if and only if $K_{k-2}(1:2) > K_{k-1}(1:2)$.

To see this we use (C.11.1) as follows:

    $K_{k-2}(1:2) > K_{k-1}(1:2)$

    $\iff \frac{a^2(2) R(2) [K_{k-1}(1:2) + Q(2)]}{R(2) + b^2(2) [K_{k-1}(1:2) + Q(2)]} > \frac{a^2(2) R(2) [K_k(1:2) + Q(2)]}{R(2) + b^2(2) [K_k(1:2) + Q(2)]}$

    $\iff [K_{k-1}(1:2) + Q(2)] \left( R(2) + b^2(2) [K_k(1:2) + Q(2)] \right) > [K_k(1:2) + Q(2)] \left( R(2) + b^2(2) [K_{k-1}(1:2) + Q(2)] \right)$

    $\iff K_{k-1}(1:2) > K_k(1:2)$

as claimed. Consequently since condition (7.12) guarantees that
$K_{N-1}(1:2) > K_N(1:2) = K_T(2)$ we have (1) of Fact 7.2 by induction.


From the recursive equation for $K_k^{LM}(1)$ (i.e. (6.105)), similar
algebraic manipulations to those above show that if $K_k^{LM}(1) > K_{k+1}^{LM}(1)$
for some specified k then, since $K_k(1:2) > K_{k+1}(1:2)$ by (1), we
have $\hat{K}_k^{LM}(1) > \hat{K}_{k+1}^{LM}(1)$, hence $K_{k-1}^{LM}(1) > K_k^{LM}(1)$. Condition (7.13)
with i = 1 guarantees that $K_{N-1}^{LM}(1) > K_N^{LM}(1)$, hence (2) of Fact 7.2
follows by induction. Similarly we obtain (3) of Fact 7.2 from (6.104)
and (7.23) with i = 2.
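
The induction in part (1) says that the sequence generated by the form-2 recursion keeps whatever ordering its first step has. A small numerical sketch of this behavior follows; it assumes scalar parameters and a recursion of the form K <- a^2 R (K + Q) / (R + b^2 (K + Q)), and the function name `k12_sequence` and the parameter values are hypothetical choices for illustration.

```python
def k12_sequence(K_T2, a2, b2, R2, Q2, steps):
    """Iterate K <- a2^2 R2 (K + Q2) / (R2 + b2^2 (K + Q2)) backwards in time.

    Returns a list seq with seq[n] playing the role of K_{N-n}(1:2),
    so seq[0] is the terminal value K_T(2).
    """
    seq = [K_T2]
    for _ in range(steps):
        S = seq[-1] + Q2
        seq.append(a2 ** 2 * R2 * S / (R2 + b2 ** 2 * S))
    return seq


# Starting below the steady-state value: the first step increases,
# and every later step increases as well.
rising = k12_sequence(K_T2=0.0, a2=1.5, b2=1.0, R2=1.0, Q2=1.0, steps=10)

# Starting above the steady-state value: the sequence decreases throughout.
falling = k12_sequence(K_T2=5.0, a2=1.5, b2=1.0, R2=1.0, Q2=1.0, steps=10)
```

In both cases the direction of the first step determines the direction of all subsequent steps, which is the content of the claim above; a condition such as (7.12) then only needs to fix the direction at the terminal time.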




C.12 Proof of Fact 7.3


This  fact  is  proved  by  induction  in  each  of  the  two  cases.  Recall 
the  equations  for  the  endpiece,  middlepiece,  upper  bound  and  lower 
bound  parameters: 


k£®(1)  =  k£S(1) 


a2  (1)  R (1)  K^d) 

R (1)  +  b2(l)  K^d) 


(C.12.1) 


k£®{1)  =  (1  -  “2>(Kk+l(1)  +  Q(1))  +  ^2(Kk+l(1:2)  +  2(2)) 


and 


K^e(l)  =  K*®(1)  =  1^(1) 


a2  (1)  R(l)  K^d) 
R(l)  +  b2(l)  K^d) 


(C.12. 2) 


(C.12. 3) 


K^J1(1)  -  (1  =  V(K£+1(1)  +  2d)  +  aJi(Kk+i(1:2)  +  2(2))  (C.12. 4) 


K^d)  *  ^(1) 


and 


2  ^rjR 

a  (l)  Rd)  K^d) 

2  ~UB 

R(l)  +  bd)  (1) 


(C.12. 5) 


1W1)  ,.1  Ali  (  t  >  [Vl(i)  +2(i)]  (C.12.6) 


LB 

*k  (1) 


K^(l)  =  ^(1) 


a2(l)  R(l)  K^d) 

2  B 

R(l)  +  b*(l)  Kj^d) 


,.i  C.v# 


t  (1)  -  min  t _ ,  1 

M-'Vl  i=1'2 


1^(1)  -  kt(1) 


T  B 

(K^d)  +  Q(i)  i 


(C.12.7) 


(C.12.8) 


LB  UB 

At  all  times  k,  (2)  =  (2)  =  K^(l:2),  where  K^(l:2)  is  given  by 

(6.96)  H(c.ll.l). 

Suppose  u2  >  as  in  (1)  of  Fact  7.3.y  Assume  that 
k£®^(1)  ■  I^^d)  at  some  time  k+1.  Then  by  Facts  7.1  and  7.2, 


UD 

K^+^(l)  in  (C.12.6)  becomes 


kJJ1(1)  -  d  -  w2)  (K^d)  +  Q(D)  +  w2(Kk+l(ll2)  +  2(2)) 


Le 

*  W1’ 


by  (C. 12.2)  • 


Hence  by  (C.12.1),  (C.12.5)  we  have  that 


L6 


and  induction. 


C.13  Commensurate  Goals  Problem  Derivation  -  Part  I 


We  are  considering  the  following  problem  class: 


min 


N-l 


{N-l 

z  [u 


2 

K 


R(V 


+  X 


u+1 


Q(r, 


yc+r 


+  x. 


N 


Vr 


"1 


(C.13.1) 


where 


Vl  =  a(rKJ  **  +  b(V  >  a K 

rK  e  {1,2} 

p  (1,2  :x)  as  |  if  |xj  <  a 

I  (jj2  if  |  x|  >  a 

p(l,l:x)  as  1  -  p (1 , 2  :x)  p(2/l)  »  0 

where 

0  1  V1}  1  V2) 

0  <  a(l)  <a(2) 

0  <  wi  r  <*>2  <1 

0  <  a,  R (1) ,  R (2) 
b(l)  ,  b(2)  *  0 

We  assume 

a2(l)  <  b2  (1)/R(1) 

a2 (2)  ”  b2(2)/R(2) 


(C.13. 2) 


0  <  a 


(C.13. 3) 


p(2,2)  as  1 


(C.13. 4) 


(C.13. 5) 


sequences  as  (N  -  k)  increases.  We  also  assume  that 


•  the  endpieces  (1)  =  (1)  are  grven  by 


the  saine  function  of  as  the  upper  bound  (1) 

•  The  middlepiece  5  C*11  is  given  by  the 

same  function  of  x  as  the  lower  bound  V^B(1) 

Jv  K 


In  addition  we  assume  that 


b2(l)  /KT(1)\  /V2) 

a(1)  <  1  +  Mir  1  -V  +  +  M  - 


b2(l) 


V2) 


(C.13.9) 


hence we have "situation (1)" as in (7.22) and figures 7.11(a), 7.14(d).
In section 6.6 we obtained the complete solution for this problem at time
k = N-1: (6.112) - (6.118), (6.121) - (6.142) (that is, Fact 6.9). We
can show that


Vi(3:1)  <  Vi(1:1)  *  Vi(5:1)  "  W2ll)  =  Vi(4:1) 


(C.13.10) 


(as  in  figure  7.14(c))  by  the  following  lemma: 


Lemma  C . 13 . 1 : 

2  2 

For  finite  >  K2  >  0,  R  >  0  and  a  ,  b  we  have  the  following: 

(1)  K.  >  K_  if  and  only  if  a  RKL  >  a  R  K2 

12  '  - 5 —  - 2 -  (C.13.11) 

R+b  K.  R  +  b  K„ 


(2)  For  any  finite  K,  &  _R  >  - — 


R  +  b  K 


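Proof: For (1), note the following equivalences:

$$\frac{a^2 R K_1}{R + b^2 K_1} > \frac{a^2 R K_2}{R + b^2 K_2}$$

$$\Longleftrightarrow \quad a^2 R^2 K_1 + a^2 R b^2 K_1 K_2 > a^2 R^2 K_2 + a^2 R b^2 K_1 K_2$$

$$\Longleftrightarrow \quad a^2 R^2 K_1 > a^2 R^2 K_2 \quad \Longleftrightarrow \quad K_1 > K_2 .$$

Now note that

$$\lim_{K \to \infty} \frac{a^2 R K}{R + b^2 K} = \lim_{K \to \infty} \frac{a^2 R}{R/K + b^2} = \frac{a^2 R}{b^2};$$

hence (2) follows from (1).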
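A quick numerical check of Lemma C.13.1 can be sketched as follows. The function name and the parameter values are hypothetical and chosen only for illustration; the sketch assumes a ≠ 0, b ≠ 0 and R > 0.

```python
def riccati_map(K, a, b, R):
    """The map K -> a^2 R K / (R + b^2 K) that appears in Lemma C.13.1."""
    return a * a * R * K / (R + b * b * K)


def supremum(a, b, R):
    """The limit a^2 R / b^2 approached as K grows without bound."""
    return a * a * R / (b * b)


if __name__ == "__main__":
    a, b, R = 1.2, 0.8, 2.0  # illustrative values only
    for K in (0.5, 1.0, 5.0, 50.0, 1.0e6):
        # Each value is larger than the last and stays below the bound.
        print(K, riccati_map(K, a, b, R), supremum(a, b, R))
```

Part (1) of the lemma corresponds to the printed values increasing with K, and part (2) to every value staying strictly below a²R/b².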


Since in case 1 we have

$$K_N(1) = K_N(3) > K_N(2), \tag{C.13.12}$$

(1) of Lemma C.13.1 and (6.128), (6.136) yield

$$K_{N-1}(1:1) = K_{N-1}(5:1) > K_{N-1}(3:1) \tag{C.13.13}$$

and (2) of Lemma C.13.1 yields

$$K_{N-1}(2:1) = K_{N-1}(4:1) > K_{N-1}(1:1) \tag{C.13.14}$$


Since  ^^(2:1)  =  V^2'L  and  V^p-.l)  =  V2'’  we  have 


Vi(2:1)  = 


Vi  Vi(2:1) 


Vi(2:1) 


(C.13.15) 


XN-1  HN-1(2:1)  j  >  xN_x  Vl(3:1)  =  Vl(3sl) 


except  for  equality  at  xN_^  =  * 


Let us now consider time $k = N-2$. Among the eligible candidate
cost-to-go functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ are:

$$V^{t,U}_{N-2}(x_{N-2},1) = x_{N-2}^2\,K_{N-2}(t) + x_{N-2}\,H_{N-2}(t) + G_{N-2}(t) \tag{C.13.16}$$


0N-2(t) 


i  *H-2  1 


®N-2(t) 


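where (see appendix C.1) we have

$$K_{N-2}(t) = \frac{a^2(1)\,R(1)\,\hat K_{N-1}(t)}{R(1) + b^2(1)\,\hat K_{N-1}(t)} \tag{C.13.17}$$

$$H_{N-2}(1) = H_{N-2}(3) = H_{N-2}(4) = H_{N-2}(5) = H_{N-2}(7) = 0$$

$$H_{N-2}(2) = \frac{a(1)\,R(1)\,\hat H_{N-1}(2)}{R(1) + b^2(1)\,\hat K_{N-1}(2)} = -H_{N-2}(6) \tag{C.13.18}$$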


In  the  sense  of  section  7.2 


$$G_{N-2}(1) = G_{N-2}(3) = G_{N-2}(4) = G_{N-2}(5) = G_{N-2}(7) = 0$$

$$G_{N-2}(2) = \hat G_{N-1}(2) - \frac{b^2(1)\,[\hat H_{N-1}(2)]^2}{4\,[R(1) + b^2(1)\,\hat K_{N-1}(2)]} = G_{N-2}(6) \tag{C.13.19}$$
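The parameter updates of the form a²RK̂/(R + b²K̂), aRĤ/(R + b²K̂) and Ĝ - b²Ĥ²/(4(R + b²K̂)) can be reproduced by a one-step minimization. The sketch below assumes that the quantity minimized over u is R u² + K̂ y² + Ĥ y + Ĝ with y = a x + b u; this form, the function name and the returned control-law coefficients are assumptions of the sketch, not statements of the formulas in Appendix C.1.

```python
def one_step_update(K_hat, H_hat, G_hat, a, b, R):
    """Minimize R*u**2 + K_hat*y**2 + H_hat*y + G_hat over u, with y = a*x + b*u.

    Returns (K, H, G, gain, offset), where the minimum equals
    K*x**2 + H*x + G and is attained at u = gain*x + offset.
    """
    d = R + b * b * K_hat               # common denominator R + b^2 K_hat
    K = a * a * R * K_hat / d           # quadratic coefficient
    H = a * R * H_hat / d               # linear coefficient
    G = G_hat - b * b * H_hat ** 2 / (4.0 * d)  # constant term
    gain = -a * b * K_hat / d           # minimizing control, linear part
    offset = -b * H_hat / (2.0 * d)     # minimizing control, constant part
    return K, H, G, gain, offset
```

Setting Ĥ = Ĝ = 0 recovers the purely quadratic pieces (H = G = 0), while a nonzero Ĥ produces the shifted pieces.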


Q(l) 

2(12) 

Vi(tl  -  <i-»2) 

+ 

+  W2 

- 

A 

Q(l) 

Q  (2) 

Kn_1(4)  =  (1-^) 

•f 

+  tol 

+ 

^W3:1)- 

k-i(i:2)-* 

WtJ 


(t-2 : 1)  * 

"Vi(1:2)‘ 

+ 

+  (jj~ 

2 

+ 

Q  (1) 

2(2) 

m 

L. 

t  -  5,6,7 


(C.13.20) 

A 

Vl(t)  *  0  t=l,3,4,5,7 


HM-1<2)  *  <1-“2)  Vl(2:1) 


(l-u>2)2a(l)  R(l)o 

' 


-Vil6> 

(C. 13.21) 


Vl(t)  =  0  t-1,3,4,5,7 

"N-l{2)  *  (1~w2)  GN-1(2:1)  *  (1-<ya2(R(D  +  b2(l)  ^  (2)) 


and 


W*'  *  Vl11'11 


[” 


0N-2(t) 


b  (1)  ^(t)' 


R(l) 


■  WC)  [X  * 


R(l) 


♦  ^  Vi1'1 

2R(1)  1 


(C. 13 .22) 


t=2 ,3,.. .,7 


b  (1)  Vi(t>  I  .  b2(i) 


I 

+ 


(C. 13 .23) 


2R(1) 


HN-l(t) 


where 


yN-l(1)  =  'V-l*15  YN-1(3)  ■  a  yn-i(5)  “  6n-i(3) 

Vl(2)  =  Vi(2)  yn-i(4)  =  a  yn-i(6)  =  Vi(4) 


Given  (C.13.10),  ^  ^  and  (1*2) 


we  have 

(C.13.24) 

>  Vl'11  E  Vi<7>  >  Vi<3>  =  Vi  (5)  >  Vi'4> 


by Lemma C.13.1, and thus

$$V^{4,U}_{N-2} < V^{3,U}_{N-2} = V^{5,U}_{N-2} < V^{7,U}_{N-2} = V^{1,U}_{N-2} \tag{C.13.25}$$

at all $x_{N-2}$.


By  (C.13.15)  we  also  have 

except  at  equality 


v2'u  >  v3'u 

N-2  N-2 


at 


6,U  5,U 

N-2  N-2 


except  equality 
at 


y2»> 

xN-2  “  a(l) 


9N-2(-6) 


(C. 13.26) 


’N-2  a (1) 




Note that (C.13.24) - (C.13.26) are the same as (7.26) - (7.28).


In addition to the $V^{t,U}_{N-2}$ ($t = 1,\ldots,7$), the other two eligible
candidate functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ are


.,4  ,  L  .  . . 

7N-2  ^N— 2 '  rN-2=*1 


4  /  R  t  . 

VN-2  (xN-2,rN-2=1) 


V: 


*N-2  W4) 


+  G  '  (4,  4) 

N-2) 


where 


(1) 


=  a2(l)  R(l)/b2(l) 

(C. 13 .26) 

$$H_{N-2}(3) = 2a(1)\,R(1)\,\alpha / b^2(1) \tag{C.13.27}$$

$$H_{N-2}(4) = -2a(1)\,R(1)\,\alpha / b^2(1) \tag{C.13.28}$$

G  (3,4)  *  a2  [K  (4)  +  ^ - 

N-2  ~N-1  b2(u 

1  -  V2(4,4) 

(C.13.29) 

The  relationships  (7.29)  -  (7.30) 

are  by  definition. 

and  Lemma  C.13. 

yields 

v2 >  v2(2)  >  v2(1) 

• 

(C.13.30) 

When the first situation for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ occurs (as shown in
figure 7.16) we have


(1) 


From  Appendix  C.l. 


(C. 13 .31) 


$$V_{N-2}(x_{N-2}, r_{N-2}=1) = x_{N-2}^2\,K_{N-2}(i:1) + x_{N-2}\,H_{N-2}(i:1) + G_{N-2}(i:1)$$

for $\delta_{N-2}(i-1) \le x_{N-2} \le \delta_{N-2}(i)$
with $i = 1,2,\ldots,m_{N-2}(1)$

where

$$m_{N-2}(1) = 9.$$

Here 


KN-2(l:1) 

ll 

1 

to 

H* 

i=l / 2 , 3 

W4!l) 

^N-2  KN-2(6i 

;1) 

KM— 2 (5sl* 

-  v2(4) 

KH-2U:1) 

‘  KH-211'21 

i»7,8,9 

H„  (i:l) 

«  G„  „(i:l)  *  0 

i=l ,  3 , 5 , 7 , 9 

(C.13.32) 


(C.13.33) 


v2<4) 


a(l) 


a(l) 


-  -V2(S) 


(C.13.34) 


and 


6N-2(3)  *  -V2(6> 


SS-2 (1)  “'W8’ 


Joining point $\delta_{N-2}(1)$ occurs at the leftmost intersection of the functions
$V^{1,U}_{N-2}$ and $V^{2,U}_{N-2}$. This is the least $x_{N-2}$ such that

$$V^{2,U}_{N-2} - V^{1,U}_{N-2} = x_{N-2}^2\,[K_{N-2}(2) - K_{N-2}(1)] + x_{N-2}\,H_{N-2}(2) + G_{N-2}(2) = 0 .$$


From  figure  7.15  we  see  that  this  intersection  exists  for  all  parameter 
values  consistent  with  the  assumptions  of  this  section. 

Using  the  quadratic  formula  we  find  this  point  to  be 


*1 


[R  (1)  +fc>2  (1)  K  (2)  ] 

_ _ 

[R(U  +  b2(l)  ^(1)][r(1)  +  b2(l)  KN_i(l)3 


R(l)  +  b2(l)  ^(2) 
-a2(l)  R2  (1)  ( 1  qj2 ) 

R (1)  +  b2(l)  £^(2) 


Cl  +  loF"  V2)1  11  +  tot1  {(1-^2)Q(1)  +  “2{Q(2)  +  Vl(1:2))l3 


b  (1)  ( 


b2  (1) 


+  rTI)  V2)  (1  -  “2>  a  (1) 


V«J  E^l ^ 


R  (1) 


(C.13.36) 


Joining point $\delta_{N-2}(3)$ occurs at the leftmost intersection of the functions
$V^{3,U}_{N-2}$ and $V^{4,L}_{N-2}$. Using the quadratic formula, we find this to be


V2(3) 


— ot 
a(l) 


(1  + 


b2(l) 

R(l) 


R(l)  +  b2(l)  (4) 

R (1)  +  b2(l)  K^O) 


(C.13.38) 


Since $\hat K_{N-1}(4) < \hat K_{N-1}(3)$, this intersection of $V^{3,U}_{N-2}$ and $V^{4,L}_{N-2}$ always
exists (for the problems of this section). This completes the derivation
of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ for the situation that is shown in figure 7.16.
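Each joining point in this appendix is found by setting the difference of two quadratic candidate costs to zero and taking the leftmost real root. A minimal sketch of that step is given below; the tuple layout (K, H, G) and the function name are hypothetical, and the sketch returns None when the two candidates do not intersect, a case that can arise for some parameter values.

```python
import math


def leftmost_intersection(q1, q2):
    """Smallest real x with V1(x) == V2(x), where V(x) = K*x**2 + H*x + G.

    q1 and q2 are (K, H, G) tuples. Returns None if there is no
    intersection (or if the two functions differ only by a constant).
    """
    dK = q1[0] - q2[0]
    dH = q1[1] - q2[1]
    dG = q1[2] - q2[2]
    if dK == 0:
        # The difference is linear: at most one crossing.
        return None if dH == 0 else -dG / dH
    disc = dH * dH - 4.0 * dK * dG
    if disc < 0:
        return None  # no real root: the two costs never cross
    root = math.sqrt(disc)
    return min((-dH - root) / (2.0 * dK), (-dH + root) / (2.0 * dK))
```

The rightmost intersection, used when sweeping in the opposite direction, is obtained by replacing `min` with `max`.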

Next let us consider the situation shown in figure 7.17. Here
(C.13.31) holds with

$$m_{N-2}(1) = 7 \tag{C.13.39}$$


v2(iil) 

=  Kn_2 (i)  i*l/2lfe,7 

V2(4il) 

*  V2(4)  ia6'7 

W3*1) 

=  V2=kn-2(5:1) 

HN-2(l:1) 

"  GN_2(i:D  =  0  i=*l,  4, 7 

(C.13.40) 

hn-2(2;1) 

=  hn-2(2)  =  m  "hn-2(6:1) 

W2:1) 

-  w2)  58  V2(6)  *  V2(6:1) 

hn-2(3:1) 

-  V2(3)  =  'W45  -  -V2(5il) 

v2(3:1) 

=  W3'4)  -  W4'4)  =V2(5il) 

with 

V2<0> 

:=  -  °°  6  (?)  -  +  ® 

V2!3) 

\-2W  "W41  . 

“  h5N-2{4) 

a(l)  a (1) 

(C.13.41) 

and 

v2(1) 

*  ‘W*1  * 

(C.13.42) 

v2(2) 

*  ‘W5’ 

Joining point $\delta_{N-2}(1)$ occurs at the leftmost intersection of the
functions $V^{1,U}_{N-2}$ and $V^{2,U}_{N-2}$, which we have already computed in
(C.13.35).

Joining point $\delta_{N-2}(2)$ is the leftmost intersection
of $V^{4,L}_{N-2}$ and $V^{2,U}_{N-2}$.

This intersection must exist for the situation of figure 7.17 to occur.

That is, it is the least $x_{N-2}$ such that


VN-2  -  VN-2  *  Vi  [V-2  '  V-21211  *  V-2  ‘V2(3> 


Vi'211 


*  ‘Vi13’41  ■  Vi1211  -  0 


Using  the  quadratic  formula  we  find  that 


<W2)  - 


-a 


a  (1) 


2 

r 

i  + 

(  (1-W  ) Q (1) 

1  +>/l  -  x2 

R(l) 

< 

(  +  w2 (Q (2)  +  Kn_1(1:2) ) 

- 

L.  - 

where 


(C. 13.43) 


x2  "  (i  “  V 


(1  + 


b2(l) 

R(l) 


L ♦  »’<» 

y  R  (l) 

_+  oo2  (Q(2>  +  KN_1  (1 :2)  )J  / 

b2(1)  2 

+  - -  K„<2>  (1  -  u>  )  a  (1) 

R  (1)  ^  2 


1  + 


b2(l) 

R(l) 


j(l-  UJ2)Q(1)  +  W2(Q(2)  +  KN_1(1:2))  j' 


(C.13.44) 

We note in passing that, in general, there need not be any intersection
of the functions $V^{2,U}_{N-2}(x_{N-2})$ and $V^{4,L}_{N-2}(x_{N-2})$. That is, we can have
$\lambda_2 > 1$. The condition $\lambda_2 \le 1$
is necessary for the situation of figure 7.17 to occur.




Now let us consider the situation shown in figure 7.18.
Here (C.13.31) holds with

$$m_{N-2}(1) = 5 \tag{C.13.45}$$

and 

V2(1,l) 

=  V2(1) 

-*✓ 

V2(2:1) 

-  Vj  ■  V2(4i1) 

V2(3:1) 

-  V2<4) 

v2(5:1) 

-  1W7> 

«N-2(i:1) 

=  Ga_2(i:l)  -  0  i-1,3,5 

(C . 13 .46) 

HN-2(2:1) 

-  W31  -  -V2141  -  -V2(4,1) 

GN-2(2s1) 

*  V2<3-41  *  Vj‘4'«  ‘  W4,l) 

*ith 

W°>  ~ 

V2<5)  *  4 " 

e 

6..  ,(2)  *  - 

N-2(41  -V2(4> 

— —  =  — £-= —  =  -a  _  (3) 

(C. 13 .47) 

4/L 


Joining point $\delta_{N-2}(1)$ is the leftmost intersection of $V^{4,L}_{N-2}$ and $V^{1,U}_{N-2}$.

That is, the least $x_{N-2}$ such that


V^2  -  V^2  -  >4-2  [V2  -  Vi'111  +  V2  V2(3))  +  0»-2(3'4>  =  0 


Using  the  quadratic  formula  we  find  that 


<W1}  - 


a(l) 


„  k2  „  (  /  R  (1)  + 

-  (1  +  ^-^-K  (.))  (l  +  *  \  ~  - 

.)  R(l)  '  R  (1)  + 


b  (1)  **-l 


(4) 


(i)  +  b‘(D  Vi(1) 


fC-13.48) 


Since $\hat K_{N-1}(4) < \hat K_{N-1}(1)$, this intersection of $V^{4,L}_{N-2}$ and $V^{1,U}_{N-2}$ always exists
(for the problems of this section).


In fact 7.4 we list several graphical conditions on the candidate
expected costs-to-go for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ that relate to the three
situations described above. In particular


situation  (1) 

(C. 13.31)  -  (C.13.38) 
figure  7.16 


Leftmost  intersection  of 


and  vj:  to  the  right 


°f  0N-2  (3)/a(1) 


*(C.13 .49) 


situation  (3) 
(C.13.31) , (C. 13.45)- 
v(C.13.48) 


Leftmost  intersection  of 


4 ,L  1,U 

V  and  V  to  the  left  of 
N-z  N-2 


(or  at)  leftmost  intersection 


(C.13.50; 




Now from (C.13.22) and (C.13.38), the right hand side of (C.13.49) becomes
(7.32). Since the denominator of (7.31) is greater than one, any a(1)
satisfying (7.32) will be consistent with our assumption (C.13.9). Note
that


1  <  1  + 


R(l)  +  b2 (1)  K^_1(4) 
R(l)  +  b2(l)  Kn_]_(3) 


<  2 


(since $\hat K_{N-1}(4) < \hat K_{N-1}(3)$). Thus we have that (7.31) holds if (7.35)
holds.


We  can  obtain  the  less  conservative  ,  sufficient  condition*  (734), (7.3&) 
substituting  in  (7.32)  (733  the  values 


$$\hat K_{N-1}(4) = (1-\omega_1)\,(K_{N-1}(3:1) + Q(1)) + \omega_1\,(K_{N-1}(1:2) + Q(2))$$

$$\hat K_{N-1}(3) = (1-\omega_2)\,(K_{N-1}(3:1) + Q(1)) + \omega_2\,(K_{N-1}(1:2) + Q(2))$$

Kn_,  (1)  =  0'^UN.iCv) -wQCO) Q(^ 

Since,  by  facts  7.1  -  7.3, 


V1}  iVi(3:1)  <  Vi(1:I)<kn-'<i:1) 


(C. 13 .51) 


we  have 


(C.13.52) 


b2(l)  , 
R(l)  2 


-  Qd) )) 


2 

1+  (^  .  (1:2)  +  (1-U)  )  Q(l)  +o),  Q(2)) 

R(l)  C 


<  1 


Rd)  +  b  (1)  K  (4)> 

N-l 


R(l)  +  b"(l)  KN_1(3)y 


R(l)  {  2 


~aJl)|KN-l(1:2)  +  2(D-Q0)- Kt(') 


1  +  ; 


b2(l) 


R(l)  ((l-Wj)  (Kt(1)  +  Q(l)  +  u>2(Kn_1(1s2)  +  Q  (2) ) ) 




4D-A131  383  FAULT  TOLERANT  OPTIMAL  CONTROLS)  MASSACHUSETTS  INST  OF  18/10  \] 

'  TECH  CAMBRIDGE  LAB  FOR  INFORMATION  AND  DECISION  SVSTEMS  1 

H  JCCHIZECK  AUG  82  LIDS-TH-1260  N00014-77-C-0224 

F/G  9/2 


UNCLASSIFIED 




From (7.32) and the right two terms of (C.13.52) we get (7.34). From
(7.32) and the left two terms of (C.13.52) we obtain (7.37).

From (C.13.35) and (C.13.48) the right hand side of (C.13.50) becomes
(7.33). Using (C.13.51), we can derive the inequalities


C.14 Proof of Proposition 7.7


Proposition 7.7 is proved inductively for decreasing values of
(N-k). From Fact 6.9 we see that the proposition holds at time (N-1).
In appendix C.13 it is shown that Proposition 7.7 holds at time (N-2).
We prove that Proposition 7.7 holds at all times $(N-\ell)$ by an induction
on $\ell$, beginning with $\ell = 2$.

Suppose that Proposition 7.7 holds at time $(N-\ell+1)$. We will show
that it holds at $N-\ell$ as well.

We first show that Proposition 7.7(4) holds; that is, the grid points
in the composite partition of $x_{N-k+1}$ obey (7.44) for $k = \ell$. This is clearly
true if we have


<w‘21-2> '  - « 


(C.14.1) 


“  <  V  U(2t  -  » 


(C.14.2) 


From Proposition 7.7(4) (for $k = \ell-1$) we have


N-2,+1 


(2i-2>  -  -Vui'2*-1’ 


Thus we need only verify (C.14.1) to prove that Proposition 7.7(4) holds
for $k = \ell$.

We have assumed (by (7.12) and Fact 7.6(1)) that

a(1)  <1  +  wr  V2) 

and  since  1(^(2)  =  K^1  and  {K^^} 

increases  with  (N-k) ,  we  have 

a (IX  1+  rrr^-  .for  all  k  . 


Since  (7.44)  and  Proposition  7.7(5)  hold  for  k  =  l-l,  we  have 


*  Vt*i(!l> 

hence 

*(1>  <  1  +  Vui'2*> 

Therefore  we  have1 


-a 
a  (1) 


(1  + 


b2(l) 

R(l) 


W(2l)  "  -a 


(C.14.3) 


By (7.68) the left-hand side of (C.14.3) is $\delta_{N-\ell+1}(2\ell-2)$. Thus (C.14.1)
holds, so we have verified Proposition 7.7(4) for $k = \ell$.

The composite partition and the eligible candidate cost-to-go
functions for $V_{N-\ell}(x_{N-\ell}, r_{N-\ell}=1)$ are shown in figure C.14.1. The formulas
for the parameters of each of these candidate cost functions and associated
control laws are given in Appendices C.1 - C.4.

Using the fact that Proposition 7.7(4) is true for $\ell = k$ (verified
above) and that Proposition 7.7(9-10) are true for $\ell = k-1$ (by assumption)
we can simultaneously verify items 1-3 and 5-8 of Proposition 7.7 for $\ell = k$.
Then we will prove that Proposition 7.7(9,10) hold for $\ell = k$, to complete
the inductive step.

Given  the  composite  partition  of  (7.44),  we  can  use  Proposition 

5.2  to  list  the  eligible  candidate  costs: 

vjj:?  t  =  i . 4i-i 

V2*,L 


Figure C.14.1 Composite x-Partition Grid Points and Intervals.


Formulas for the parameters of these cost functions are given in
Appendices C.1 - C.4. Now given that Proposition 7.7(9) is true at
$k = \ell-1$ we have


v2l-l,U  2A-3,U 

N-A  N-A 


< 


as  shown  in  figure  C.14.2. 


(C.14.4) 


From Facts 7.1 - 7.3 we know the formulas for the controller endpieces

=  v1,u  =  V  (1*1) 

N  -A  VN-Al  x; 

4J-1  n 

-  Vi  ’  -  WVl111'11 

and  the  middlepiece 

V1"  (1) 

N-r  J 

-  Vl'Vi'1’/2  +  l!l>  -  vH-i° 

• 

Since  the  middlepiece  cost  function  in  (C.14.5)  is  also  the  lower  bound 
cost  function 


V»  (1)  .V»(l) 


tt-l 


N-l 


'N-A 


(by  Facts  7.1  -  7.3),  it  is  optimal  over  its  entire  region  of  validity: 

^N-A(1)  ~  a(l)  (1  +  R(l) J  KN-A+1C*,>)  <  xn-A  <IUT  (1 
+  R(l)}  KN-A+1<2*),"*H-A<l,t 

Now  let  us  consider  VN_£  (XN4'rN-Jl=1)  from  XN-1=0  leftwards  (for 

2%,  L  2 2,  U 

increasingly  negative  x„  s).  Since  V„  »  intersects  V„  '  at  6„  „ (1) , 

N-X,  N-X,  N-A  -N-A 


it  is  clearly  optimal  immediately  to  the  left  of  5^_^(1),  as  shown  in 


figure  C.14.3. 


2  J,  l 

VN-Jl  remain  optimal  as  we  consider  increasingly  negative 

xn_£  in  figure  C.14.3,  imtil  it  intersects  another  valid  eligible^" 
candidate  cost-to-go  function.  As  shown  in  figure  C.14.3,  this  next 
optimal  piece  of  vn_2^xn_2  rN_2=1*  coincide  with  V1 2^1,U  if, 

'  is  valid  immediately  to  the  left  of  its  intersection  with 

2  j£ 

VN_'  .  If  this  is  the  case  then  this  intersection  is  a  joining  point 

olLi  tj  22-1  U 

of  VN_2 ^XN-2'rN-2=1^  and  VN-2  *  is  °Ptimal  until  VN_2  '  ceases  to 


2  4-  2  U  2 

To  the  left  of  this  point,  V  '  will  be  valid  until  it  intersects 

N— 2 

another  valid  eligible  candidate  cost.  The  next  (to  the  left)  piece 

of  vN_2^XN-2'rN-2=1^  be  Vn-23  U  valid  immediately  to  the 

24—3  u 

left  at  its  leftmost  intersection  with  VN_2  '  .  This  process  continues 

until  vn-12  intersects  -  vjjf2(l). 


When the requirements of validity described above and shown in
figure C.14.3 are met, then (1) - (3) and (5) - (8) of Proposition 7.7
hold for $k = \ell$. That is, using (7.70) - (7.71) for $k = \ell$ we need

$$\delta_{N-\ell}(2i) < \delta_{N-\ell}(2i+1) \tag{C.14.7}$$

for $i = \ell-1, \ldots, 1$

for (1) - (3) and (5) - (8) to be true.
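The sweep described above (follow one candidate until it meets another valid candidate, then switch) can be imitated numerically by taking, at each x, the cheapest candidate that is valid there. The sketch below is a brute-force illustration under assumed data structures, with each candidate a hypothetical tuple (K, H, G, lo, hi) that is valid on lo <= x <= hi; it is not the analytic construction used in the proof, and it only brackets joining points to within one grid cell.

```python
def active_piece(x, candidates):
    """Index of the cheapest candidate valid at x, or None if none is valid.

    Each candidate is (K, H, G, lo, hi) with cost K*x**2 + H*x + G on [lo, hi].
    Ties are resolved in favour of the lower index.
    """
    best = None
    for index, (K, H, G, lo, hi) in enumerate(candidates):
        if lo <= x <= hi:
            value = K * x * x + H * x + G
            if best is None or value < best[0]:
                best = (value, index)
    return None if best is None else best[1]


def bracket_joining_points(xs, candidates):
    """Grid cells (x_left, x_right) across which the active piece changes.

    xs must be sorted in increasing order.
    """
    brackets = []
    previous = active_piece(xs[0], candidates)
    for left, right in zip(xs, xs[1:]):
        current = active_piece(right, candidates)
        if current != previous:
            brackets.append((left, right))
        previous = current
    return brackets
```

Refining the grid inside each returned cell, or solving the corresponding quadratic exactly, then locates the joining point itself.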

1  In  the  sense  of  section  7.2 

2  as  xN_a  decreases 
Let us consider


W2*-2)<  'SH-t(2ll-1> 


(C.  14.8) 


We have, by (7.71) with $k = \ell$:


Va{2A’1}  = 


2CKM-2.(2A:1)  “  Vi<2A"1:1)J 


(1  +  /1-*  ) 


(C.145) 


where 


4[KN-Jl(2i':1)  *  gN-A(2A:1) 


V«(2*,w 


(C. 14.10) 


Now  since 


lN_£(2il:l)  =  2a(l)  R(l)  «/b2(l) 


2(11  l 


„  „  &  (1)  R  (1) 

K  -(2A,1)  -  K  t(2A-l:l)  =  -r - - - = - ^ - 

b2(l)  ( R (1 ).  +  b  (1)  Vit+l*2*-15 


we  have 


2 

Vi(2l-l)  -  tut  11  *  Inr  W24-1’ 11  +/I^> 


(C.14.11) 


From  (7.70)  *  (C.14.6)  we  have 

Vi12*-21  ■  -r-.  (1  ♦  Itit  Vi«(2t-2»11  *  iur  W<2,-1)l 

a  (l) 


(C.14.12) 


From  (C.14.11)  and  (C.14.12)  we  have  that  (C.14.8)  holds  if 


a<1)<  xtlnf  Vuii2t-21 

i  + 


(C.14.13) 




1  +  /1-X 


So (C.14.13) (hence (C.14.8)) holds if


*a)  <I(l  +  i^vw2)l-2,,  • 


(C.14.14) 


Now 


V 


1+2 


(21-2) 


LM 


V 


1+2 


(<) 


so  we  need 


* 


b  (1)  ^LM 


ad)  <  fd  +  rrrr- 


R(l)  N-  £+2 


(C.14.15) 


This  is  guaranteed  to  be  true,  however,  since  by  (7.35)  we  have  assumed 


2 

ad)  <  t  d  +  Trrrr1  ^(2)5 


R(l) 


=  I  „  +  b  dl  ^  LM( 

2  U  +  R (1)  K  N  U)) 


(C.14.16) 


'lm 


and  (by  Facts  7.1  -  7.3),  increases  monotonely  as  (N-k) -*■  <».  so 

(C.14.7)  holds  for  i  «*£  -1.  For  other  values  of  i  in  (C.14.7)  we  need 


$$\delta_{N-\ell}(2i) < \delta_{N-\ell}(2i+1), \qquad i = 1,2,\ldots,\ell-2$$

to hold which, by (7.70) - (7.71) (with $k = \ell$), requires that


+  ‘dir 

a(l)  rxxx  -i+1 


trlf  K.  ..,(2i+l) 

(2i) )  n  |l  +  ~ ^ - 


j*i  L 


ad) 


-HN_g(2it2tl) 


2[Vi(2i+2,1)  ■  Vn{2i+l!l)] 


d  +  Si-xj 


(C.14.18) 


(C. 14.19) 


where $0 < \lambda_i < 1$.
Substituting for the parameters in (C.14.17) yields the requirement


•  a(l)  < 


1  +  /Pxi 


which, by (C.14.19), is guaranteed if


a(1)  "7  (1  +  ioT  Wt2i,) 


But 


V 


i+l(2i)  - 


LM 


aLM 


Vi+l{1)  < 


(1) 


(C.14.20) 


(C.14.21) 


so (C.14.15) guarantees that (C.14.21) holds for $i = 1,2,\ldots,\ell-2$. Thus
condition (7.35) = (C.14.16) results in the situation of figure C.14.2.
That is, (1) - (8) of Proposition 7.7 hold for $k = \ell$.

Given (1) - (8), it is easily verified that (9) - (10) of
Proposition 7.7 hold, using Lemma C.13.1. This completes the inductive step
(on $\ell$), and the proof of Proposition 7.7.



C.15 Conflicting Goals Problem Derivation

We  are  considering  problems  of  the  class  (C.13.1)  -  (C.13.7) 
where,  instead  of  (C.13.8)  we  have 


0) 


1 


> 


(C.15.1) 


hence 


,  by  Pact  7.3  it  follows  that-. 

the  endpieces  (1)  =  V^e(l)  are  given  are  given  by  the 
same  function  of  as  the  lower  bound  (1) 


the  middlepiece  Vj^d)  =  V^d)  is  given  by  the  same 


.UB, 


function  of  as  the  upper  bound  V^.  (1) 


That  is,  we  have  the  opposite  of  the  situation  in  section  7.5  and 
Appendix  C.13. 

We  will  also  assume  that  (7.14)  holds: 


a(l)  <  1  + 


(> 


b2  (1) 


R<1) 


V2> 


Rd)  +  b2ci)  ILd) 


Ed)  +  b2-(l)  * 


Iv,(2)  / 


(C.15.2) 


hence we have "situation (1)" of table 7.2 and figure 7.23(d) applies.

In section 6.6 we obtained the complete solution for this problem at
$k = N-1$; it is specified by (6.112) - (6.116), (6.119) - (6.120), (6.147) -
(6.154) and figures 6.14 - 6.16. Using Lemma C.13.1 we can show that,
as in figure 7.23(c):


$$K_{N-1}(2:1) = K_{N-1}(4:1) > K_{N-1}(3:1) > K_{N-1}(1:1) = K_{N-1}(5:1). \tag{C.15.3}$$

This is done as follows: since we have

$$K_N(2) < K_N(1) = K_N(3),$$




then (1) of Lemma C.13.1 and (6.128), (6.130) yield

$$K_{N-1}(1:1) = K_{N-1}(5:1) < K_{N-1}(3:1).$$

The other inequality in (C.15.3) is obtained using (2) of Lemma C.13.1.

Since  V  (2:1)  =  and  V  .  (1:1)  =  we  have 

N-l  N-l  N-l  N— 1 


except  for  equality  at  xN_^  =  <SN_1  (1)  . 

Now let us consider time $k = N-2$ for this problem. Among the eligible
candidate cost-to-go functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ are

$$V^{t,U}_{N-2}(x_{N-2},1) = x_{N-2}^2\,K_{N-2}(t) + x_{N-2}\,H_{N-2}(t) + G_{N-2}(t) \tag{C.15.5}$$


for 


9N-2(t)  <  <  0N-2(t) 

a(l)  -  XN-2  -  a (1) 


for  t  =  1 , 2 , . . . , 7 


where (see appendix C.1) we have the parameters in (C.15.5) as given by
(C.13.17) - (C.13.23).

UB 

Given  (C.15.3),  u  >  and  K^U)  <  *^(1:2) 
we  have 

$$\hat K_{N-1}(1) = \hat K_{N-1}(7) < \hat K_{N-1}(3) = \hat K_{N-1}(5) < \hat K_{N-1}(4) \tag{C.15.6}$$

by Lemma C.13.1(1) and thus

$$V^{4,U}_{N-2} > V^{3,U}_{N-2} = V^{5,U}_{N-2} > V^{7,U}_{N-2} = V^{1,U}_{N-2}. \tag{C.15.7}$$


By  (C.15.6)  we  also  have 


2  ,U  1  ,U 

XT  9  >  XT  9 

N-2  N-2 


except  equality  at  xN_2  =  0N_2(2)/a(l) 


6,U  7,U 

N-2  N-2 


except  equality  at  xN_2  =  9N_2 C“ ) / a ( 1 ) 


That  is,  we  have  verified  (7.106)  -  (7.108)  and  figure  7.24. 


In addition to the $V^{t,U}_{N-2}$ ($t = 1,\ldots,7$) the other two eligible candidate
functions for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ are


VN-2(XN-2,rN-2~1)  XN-2  ^-2  +  XN-2  HN-2(3)  f  GN-2(3,3) 


VN-2(XN-2,rN-2_1)  "  XN-2  *Sj-2  +  XN-2  HN-2(4)  +  GN-2(4,5> 


where  K,  If,  _(3)  and  H„  _(4)  are  given  by  (C.13.26)  -  (C.13.28)  and 
N— 2  N-2  N-2 


GN-2(3'3) 


a 


b  (1) 


GN-2(4'5) 


When the first situation for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ occurs (as shown in
figure 7.25) we have


$$V_{N-2}(x_{N-2}, r_{N-2}=1) = x_{N-2}^2\,K_{N-2}(i:1) + x_{N-2}\,H_{N-2}(i:1) + G_{N-2}(i:1)$$

for $\delta_{N-2}(i-1) \le x_{N-2} \le \delta_{N-2}(i)$
with $i = 1,2,\ldots,m_{N-2}(1)$ (C.15.8)

where

$$m_{N-2}(1) = 9 \tag{C.15.9}$$


Here 


*Wi:1)  =  V2(i)  1  -  ^2,3 

V2‘4!l)  =  V2  =  kn-2{6:1) 

V2t5:1>  -  v2(4) 

KN-2(i;1)  =  v2(i“2)  1  a  7'8'9 

HN_2(i:i)  -  V2(i!l)  =  0  i  “  1^3, 5, 7,9 

HN-2(2:1)  =  Hn-2(2)  =  "Hn-2(6)  =  _Hn-2(8:1)  (C.15.10) 

GN-2(2s1)  a  GN-2(2)  "  GN-2(6)  “  GN-2(8s1> 

V2(4s1)  “  5n-2(3)  =  "V2(4)  "  “HN-2(6:1) 

GN-2(4s1)  =  Gn-2(3,3)  =  GN-2(4,4)  =  GN-2(6:1) 


with 

v2(0>  =  - 00  V2(9)  * +  °° 


W11  ■ 

®N-2<1) 

.  W2’ 

'®H-215>  -V2(7> 

a(l) 

a(l) 

a(l)  a(l)  *  '6n-2(8) 

W3>  * 

z® 

i 

M 

1 

u> 

:W5> 

5N-2(6)  (C.15.11) 

a(l) 

a(l) 

and 

V2(2)  - 

-v2(7’ 

v2<4>  ■ 

-V2(S> 

• 

Joining  point  5  (2)  occurs 

N-2 

2/U  •>«»  VN-t  • 

VN-2  A  Thls  Is  t*1®  greatest 

at  the  rightmost  intersection  of  the  functions 

xN_2  such  that 


-  vn12  -  xiL  [w2)  -  v2<3)  +  vjV, 


(2)  + 


+  V2(2)=°  • 

From figure 7.25 we see that this intersection exists for all parameter
values consistent with the assumptions of this section. Using the
quadratic formula we find this point to be


W2> 


r-  (1*airV!l)(1,rWJI)(1  -'/l  -*3) 

U)  '  (C.15.: 


where 


X,  »  — > 


(1  + 


R  (1)  KNU) 


)(1  *  iof  Si-1121)- 


(1  -  u>2) a  (1) 


1  *5717  V2’) 


,  (C.15. 13) 


Joining point $\delta_{N-2}(4)$ occurs at the rightmost intersection of the
functions $V^{3,R}_{N-2}$ and $V^{4,U}_{N-2}$. Using the quadratic formula, we find this
intersection to be


6H-2(4) 


/  h2m  -  \  /  /  Rll)  +  b2(1)  t(3)  \ 

'  R(1>  ™_1  /  '  R  +  b2(l)  K.,  ,  (4)  /# 


(C.15. 14) 

Since $\hat K_{N-1}(3) < \hat K_{N-1}(4)$ for this problem, this intersection of $V^{3,R}_{N-2}$ and
$V^{4,U}_{N-2}$ always exists (for the problems of section 7.6). This completes the
derivation of $V_{N-2}(x_{N-2}, r_{N-2}=1)$ for the situation that is shown in
figure 7.25.

Next let us consider the situation shown in figure 7.26. Here (C.15.8)
holds with

$$m_{N-2}(1) = 7 \tag{C.15.15}$$

and 


KN-2(l:1) 

a 

H 

OJ 

1 

i-l,2,6,7 

v2(4sl) 

a 

Tl* 

(N 

• 

V2(3s1) 

3 

V2  -  v 

.2(5s1) 

Hn.2 

= 

GN-2(l:1) 

i=l,4,7 

«N— 2 (2:1> 

hn-2(2)  = 

‘hn-2{6)  "  ■hn-2(6:1) 

v2(2sl) 

= 

GN-2(2)  “ 

GN-2(6)  =GN-2(6:1) 

(C.15.16) 

v2(3s1) 

= 

W3)  - 

-V2(4)  s-v2(5s1) 

W3sl) 

3 

gn-2(3'3)- 

W4'5)  “gn-2(5:1) 

with 


and 


V 

-2(0) 

A 

•  00 

V2<7)  4  +  " 

5 , 

-,(1) 

%-2a) 

*W7>  «  ,6l 

*  •  - m,  "  *  -0  (6) 

N- 

-2 

a(2) 

a (1)  N-2 

V 

-2<2) 

3 

-W6> 

V 

-2(3) 

a 

-5»-2(4> 

• 

point 

V2 

(2) 

occurs  at 

the  rightmost  intersection 

(C.15.17) 


(C.15.18) 


2,U  3.R 

V.,  _  and  V  _  .  Using  the  quadratic  formula  we  find  that 

N—  Z 


where 


Xi 


(1  + 


b2(l) 

R(l) 


Vi(2))^  + 


b2  (1) 
R(l) 


kn-i(3) 


(l-w2)  (1  + 


b2(l) 

R(l) 


V11] 


+  a2  (1)  (1  -  0i2) 2 

(1  +  Mr  Vi'!>  - a<1,(1  -n2>)2 


J 


(C.15.20) 


In general, there need not be any intersection of the functions $V^{2,U}_{N-2}$ and
$V^{3,R}_{N-2}$. That is, we may have $\lambda_4 > 1$ in (C.15.20). The condition $\lambda_4 \le 1$
is necessary for the situation of figure 7.26 to occur.

Joining point $\delta_{N-2}(3)$ occurs at the rightmost intersection of the
functions $V^{3,R}_{N-2}$ and $V^{4,U}_{N-2}$, which we have already computed in (C.15.14).

Now let us consider the situation shown in figure 7.27. Here (C.15.8)
holds with

mN_2 (1)  -  5  (C.15.21) 

and 


v2(i:1) 

ai 

KN_2U)  1  =  lf2 

V2(3:1) 

s 

V2(4) 

KN-2(isl) 

- 

Kj,_2(i+2)  i  «  4,5 

HN-2(isl) 

m 

GN.2(i;1)  -  0  i  -  1,3,5 

Hn_2(2:1) 

ss 

HN-2(2)  "  -HN-2(6)  -HN-2(4:1) 

GN-2(2!l) 

38 

GN-2(2)  “GN-2<6)  “GN-2(4!l) 
with


A 

(0)  =  -  00 

6 

N-2 

A 

(5)  =  +  00 

V2(1) 

0  (7) 

N-2  1 

—  6  t  A  \ 

(1)  "  all)  ' 

a  (1) 

°N-2(4) 

(2)  -  -<5s_2<3> 

• 

(C. 15. 23) 


(C.15.24) 


Joining point $\delta_{N-2}(2)$ occurs at the rightmost intersection of the functions
$V^{2,U}_{N-2}$ and $V^{4,U}_{N-2}$. Using the quadratic formula we find this intersection to be

-a(l-w2)(R U)  +  b  (1)  (4) 

5  (2)  =  - - - 7T - « - 

.  i  A  /  \  A  /^\ 


b  (1)^(2)  -  Kjj_1  (4) 


(l-J 


1  -X 


(C. 15. 25) 


2  2 

2  M .  (1  2}  (1  R(l)  V2))(1  R(l) 

a  (1) 


(C.15.26) 


where 


b2(l) 


ft1*  Inf  V»> 111  ^uj-q-^uni 


(R  +  b2  (1)  (4))  a2  (1)  (1-oj2) 


(C. 15.27) 


^*6  "  1  +  (“i  “  W2)(1  +  b2(1))(R(1)  +  fa2(1)  V2))(Q(1)  ~  Q<2> 


"  ^-l^12^  (C.15.28) 

Since $V^{2,U}_{N-2}$ and $V^{4,U}_{N-2}$ must have two intersections to the left of $x = 0$,
(C.15.26) implies that

S-l{2)  ?  Vl(4)  •  (C.15.29) 




In fact 7.9 we list several graphical conditions on the candidate
expected costs-to-go for $V_{N-2}(x_{N-2}, r_{N-2}=1)$ that relate to the possible
situations described above. In particular,


From (C.15.12) and (C.15.22), the right-hand side of (C.15.30) becomes
(7.113). From (C.15.14) and (C.15.25), the right-hand side of (C.15.31)
becomes (7.114).


Using the fact that

$$K_{N-1}(2:1) = a^2(1)\,R(1)/b^2(1)$$

we can rewrite (C.15.13) as


(C.15.32) 


which  leads  to  the  bound 


>  ^  v# +  ^  P1' v  8<1>  1) 

l+  W2(KN-1(1:2)  +  2(2))J/ 

am  T\ 

I  U>2  Q (2)  +  KN_1(X:2)j  I 


(C.15.33) 


which  yields  (7.115)  of  fact  7.11  directly.  To  obtain  (7.116),  we  note 
from  (C.15.13)  that 


v  iiisjstkii 

1  +  rTiF"  V21 


(C.15.34) 


Since  K^U)  >  Thus  since 

V2>  *  ?<«  » 


(by  Fact  7.3),  (C.15.34)  yields  (7.116) 
C.16 Proof of Proposition 7.12


Proposition 7.12 is proved inductively, for decreasing values of
(N-k). From Fact 6.10, we see that the proposition holds at time (N-1).
In appendix C.15, it is shown that Proposition 7.12 holds at time (N-2).
We prove that Proposition 7.12 holds at all times $(N-\ell)$ by an induction
on $\ell$, beginning with $\ell = 2$.

Suppose that Proposition 7.12 holds at time $(N-\ell+1)$. We will
show that it holds at time $N-\ell$ as well.

We first show that Proposition 7.12(4) holds; that is, the grid points
in the composite partition of $x_{N-k+1}$ obey (7.112) for $k = \ell$. Following
the argument for proving Proposition 7.7(4) in Appendix C.14, we need
only verify that


W'21'2’  <  ♦ 


(C.16.1) 


Using the expression for $\delta_{N-\ell+1}(2\ell-2)$ given in (7.128), we have that
(C.16.1) holds if


<  i1  t5TirKl.-U2(2t-2>)(1  - 


R(l)  +  b2(l)  K^_£+2 (21-3) 


r(d  +b  a)  Vw(2i-2)  . 


(C.16.2) 


But  (C.16.2)  is  implied  by  assumption  (7.117).  To  see  this,  note  that 
(C.16.2)  can  be  rewritten  as 


fb'd)  [2(2>-2(D+Vl+2(l!21&! 


a(l)  <  1  + 


b2  (1) 


-0^)  [Q(2)-Q(l)  +  ^(l^l-Kjll)] 


b2(l)  K^d) 


(C.16.4) 


Since,  by  Facts  7.1  -  7.3,  the  sequences (k^ (1:2) }  ,  {k^1  (1)}  and 

{k^U) }  increase,  with  decreasing  k,  (C.16.4)  =  (7.117)  is  more  restrictive 

than (C.16.3) = (C.16.2). Thus (C.16.1) holds, and we have shown that
Proposition 7.12(4) holds for $k = \ell$.

The composite x-partition and the eligible candidate cost-to-go
functions for $V_{N-\ell}(x_{N-\ell}, r_{N-\ell}=1)$ are shown in figure C.16.1. The formulas
for the parameters of each of these candidate cost functions and
associated control laws are given in Appendices C.1 - C.4. Note that
2  o+i  r 

VN  2.  #  eli9ible  311(1  valid  for  xn-2,+i  —  0;  we  can  exclude  itseligi- 

22.  R  22.  R 

bility  in  the  interval  (0,a)  because  VN_£  is  eligible  here  and  VN_^ 


and  V' 


22+1  ,L 


cross  at  *  0. 


Using the fact that Proposition 7.12(4) is true for $\ell = k$ (as verified
above) and that Proposition 7.12(10-11) are true for $\ell = k-1$ (by
assumption) we can simultaneously verify items (1) - (3) and (5) - (9)
of Proposition 7.12 for $\ell = k$. Then we will prove that Proposition
7.12(10,11) holds for $\ell = k$, to complete the inductive step.

Given the composite partition of (7.122), we can use
Proposition 5.2 to list the eligible candidate costs:


Figure C.16.1: Composite x-Partition Grid Points and Intervals


Now given that (7.134) of Proposition 7.12 is true at $k = \ell-1$, we have

$$V^{1,U}_{N-\ell} < V^{3,U}_{N-\ell} < \cdots < V^{2\ell-1,U}_{N-\ell} < V^{2\ell,U}_{N-\ell} \tag{C.16.5}$$


as  shown  in  figure  C.16.2. 


From  Facts  7.1  -  7.3  we  know  the  formula  for  the  controller  endpieces 


vn!*(1) 


s  v1,u  =  V  (1-1) 
N-il  N-JT  1 


v*e 


N-l 

and  the  middlepiece 


(1) 


45,-1  U 

V/  =  WW1)8l) 


(1) 


(1)  *  V  (^1^ -  +  1-1)  =V2*'U 

N-il  ^  J  N-il  2  VN -l 


From figures C.16.1-2 we see that $V^{2\ell,U}_{N-\ell}$ is optimal over some interval
about zero because at $x_{N-\ell} = 0$, $V^{2\ell,U}_{N-\ell}$ is less than the other two
candidates ($V^{2\ell+1,L}_{N-\ell}$ and $V^{2\ell,R}_{N-\ell}$).


The endpiece functions $V^{1,U}_{N-\ell}$ and $V^{4\ell-1,U}_{N-\ell}$ are the same as the lower
bound function $V^{LB}_{N-\ell}(1)$ for this problem. Thus these functions are
optimal over their entire regions of validity:


VH-2(XH-l'rN-lS‘:l>  “ 


,  „  SN-!(1) 

f°r  Vf- - la r 


VN-i  (XtI-?/rM.S.El1  "  VN-f. 


for  x. 


,  W4  -1} 


'N-2 


a  (1) 


Now- let  us  consider  VN_^  » rN--2,=B3‘ ^  as  we  sweeP  rightwards  from 
x^  .  *  -  00 .  To  the  immediate  right  of 


Ordering  of  Candidate  Cost  Functions 


increases,  until  it 


2  U 

VN'^  is  optimal.  It  will  remain  optimal  as  xN_^ 

intersects  another  eligible  valid  candidate  cost.  This  next  optimal 

cost  will  be  ,  unless  is  not  valid  to  the  right  of  the  V2 ' ^ 

N-X,  . . .  "  N-J6 -  N-JC 

and  intersection  (see  figure  C.16.3).  If  is  valid  immediately 

N-x,  N-x. 

to  the  -  right  of  this  intersection,  then  this  intersection  is  joining 

point  6  0  (2)  .  will  then  be  optimal  (to  the  left  of  6  (2))  until 

N—  x,  N— x,  N— Jo 

W3)  *  V*(3>/a(1>  ' 

where  V2 ceases  to  be  valid  and  V4'^  becomes  optimal  (for  l  >  3) . 

N— x,  N-x,  — 

Then  next  joining  point  will  be  at  the  intersection  of  V4'^  and 

N-x,  N-J6 

(for  i  >  3)  ,  if  V5/Un  is  valid  here. 

—  N-  x. 

This  pattern  continues  (if  the  validity  requirements  shown  in 
figure  C.16.3  are  met)  until,  at 

W2a_1)  53  0n -e(2H)/all>/ 

2i-l  R  22,-1  R 

the  optimal  cost  becomes  V„  '  V„  „  '  is  then  optimal  until  it 

N-x,  •  N-X. 

2Z  U 

intersects  VN_^  (the  middlepiece) . 

When the requirements of validity that are described above and shown in figure C.16.3 are met, then (1) - (3) and (5) - (9) of Proposition 7.12 hold for $k = \ell$. That is, using (7.129) - (7.130) for $k = \ell$ we need

$$\delta_{N-\ell}(2i) < \delta_{N-\ell}(2i+1)$$    (C.16.6)

for $i = 1, \ldots, \ell - 1$

for (1) - (3) and (5) - (9) to be true.

From  (7.129)  -  (7.130)  we  can  rewrite  (C.16.6)  as 


Figure C.16.3: Finding the Optimal Cost for negative x moving leftwards from zero.


(C.16.7) 


a(l)  <  (1  +  V..|(2i))(1  •  Jl  "V* 


where 


■*„<»)  = 


RCD-^b  (DV£+p(2i) 


R(l)  +  o'  (1)  (2i)  p=l  Rd)  +b  (DKN_Jl+p^2i+1) 


l-\  [a2(l)  R2(l)  (1  -  oj2)]S 


s-1 


,.i  tR<1)  +b2(1)  W,!i"  7T^1)+b2(1>Vi-q+i(2i)J: 

q=l 

(C.16.8)  • 

If (C.16.7) holds at each $i = 1, 2, \ldots, \ell - 1$ then so does (C.16.6) and, consequently, (1) - (3) and (5) - (9) of Proposition 7.12 hold for $k = \ell$.
Now

*LM 


Vwl!il  - 

so,  by  Facts  7.1  -  7.3, 


(C.16.9) 


b2(l) 


b2(l) 


(1  +  R(l)  Vi+l(2l))  >  {1  +  R(i)  ^ 


(2)) 


(C. 16. 10) 


From  (C.16.8)  we  see  that  for  each  i  =  1,...,  2.-1, 

2 


X  (i)> 


R(l)  +  b  (1)  KN_i+1(2i-l) 
R(l)  +  b2(l)  ^..+1(2i) 


(since  K^_^+^(2i)  >  K^_g+^(2i)  by  (10)  of  Proposition  (7.12) 
(C.16.9)  and  (C. 16.11)  we  obtain 


R ( 1 )  +  b  (1)  K^d) 


R (1)  +  b2  (1)  K*f(l) 


(C.16.11) 


From 


(C. 16.12) 


8' 


Since  the  middlepiece  parameter  sequence  {Ki  (1) }  increases  with  (N-k) . 
From  (C.16.7),  (C.16.10)  and  (C.16.12)  we  have  that  if 

(C.16.13) 

then  (C.16.6)  holds.  But  we  have  assumed  (C.16.13)  to  be  true,  since  it 
is  identical  to  (7.116)  of  fact  7.11. 

Given that (1) - (9) of Proposition 7.12 hold for $k = \ell$, it is easily verified that (10) - (11) are also true (using Lemma C.13.1).

This completes the inductive step (on $\ell$), and therefore the proof of


*‘l>  <  (l  +  ioT  V2i 


R(l)  +  b  (1)  K^d) 
R(l)  +  b2(l)  K^(l 


Proposition  7.12. 


C.17  Proof  of  Proposition  7.14 

Consider first a commensurate goals problem satisfying the assumptions of Proposition 7.7. Applying the controller described in (1) of Proposition 7.14, we obtain expected cost-to-go $V_{N-k}(x_{N-k}, r_{N-k}=1)$ for $(N-k) \le (N-p)$. Since the controller that we are applying is suboptimal, we have


VN-k(XN-k,rN-k=1)  -  VN-k(XN-k,rN-k‘1) 


at each $x_{N-k}$ value. From fact 7.3(1), we have that the endpiece cost function is an upper bound:


VN-k(XN-k,rN-k  1) 


VN!k(1> 


=  V^e  (1)  =  V05 
N-kV  ;  VN-k 


(1)  =  XN-k  KN-kU:1) 


From  (9)  of  Proposition  7.7  we  have  that 

Vk  Vk(2(k-P+ll+lll>  i  WVk'V11 

for  each  k  <_  p,  at  all  not  in  <<SN_k  (2(k-p)),  5N_k(2  (k+p)+l) )  . 

To  prove  (7.145)  it  remains  to  be  shown  that  for  these  x^  k  we 
have 

VN-k(XN-k,rN-k*1)  -  XN-kKN-k(1:1)  #  (C.17.1) 

For $x_{N-k} > \delta_{N-k}(4k)$ and $x_{N-k} < \delta_{N-k}(1)$, the optimal expected cost-to-go and the suboptimal controller's expected cost-to-go coincide; that is, equality holds in (C.17.1). For any $x_{N-k}$ satisfying
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W2'*4*’*1’  <  Vk  <  Vk,41t) 

or 

Vk(1>  4  Vk 4  Vk<2(k-p". 

the  approximate  controller  applies  the  endpiece  laws1 

Vk(l!l) . vk+d-i(l!l)  (or  Vk(4k+lsl) . Wi(4(k;;)+l!l)) 

until,  at  some  time  (N-k+d) ,  it  drives  xN_k+d  inside  the  interval 

(6N_k+d(2(k-d-p)),  6N_k+d(2(k-d+p)+l)).  (C.17.2) 

Then, starting with $k+d$, the suboptimal controller uses true optimal control laws. That is, the expected cost-to-go of the suboptimal controller once x is inside (C.17.2) is the optimal expected cost-to-go. Since this cost, $V_{N-k+d}(x_{N-k+d}, r_{N-k+d}=1)$, is bounded above by $x^2_{N-k+d} K_{N-k+d}(1:1)$, we have (C.17.1) and thus (7.145) holds.

For  conflicting  goals  problems  satisfying  the  assumptions  of 
Proposition  7.12,  an  analogous  argument  holds  using  fact  7.3(2) 
and  (10)  of  the  Proposition  7.12.  Part  (3)  of  Proposition  7.14 
follows  directly  from  (2) . 

o 


1  unless the system jumps to form $r = 2$


C.18  Proof of Proposition 7.16:

Consider a JLQ problem specified by (5.1) - (5.6) when the suboptimal controller of Proposition 7.16(1) is applied. Clearly at all


Vk(Vk'rN-k*3  i  Vk‘Vk’tS-k'll  (C-18-1’ 

for  each  j  e  ,  since  the  applied  controller  is  not  optimal.  Recall 
that  the  optimal  JLQ  controller  minimizes,  for  k  >  1: 


N-l  r 


Vk'Vk'Vk1  *  111111 


VkM,,'Vl 


1=0 


,\ 


XN-k+£+l  2(rN-k+i+l) 


N-k+^+1  ^  N— k+^+1^ 


P(rN-k+il+l) 


“n-H+JI  R(rN-k+«,) 
+ 


\VVV 


(C.18.2) 


where 


V*,r) 


x  ^(r)  +  x  HT(r)  +  GT(r) 


(C.18.3) 


,L/P( 


From  (7.151)  we  see  that  V  (x„  ,  ,r„  ,  )  corresponds  to  the  problem  in 

N— K  N-K 

(C.18.2)  if  no  costs  are  incurred  after  time  (N-p) .  That  is,  if 

VM  (x  ,r  )  *  0.  Consequently 
N-p  N-p  N-p 


vt'p<Vk'Vk*j’  i  VkVk'W1’ 


(C. 18.4) 


for  each 


xN_Jt  6  3R  ,  at  each  j  6  n.  From  (7.152)  -  (7.153),  we  see 


that  V  'f (x„  ,  ,r„  ,  )  solves  the  problem  in  (C.18.2)  if  we  have  the 

N-K  N-K  N— K 

additional  constraints 


N-k+i 


l  =  p,p+l, . . . ,k 


Thus  at  each  time  (N-k) , 


Vk'Vk'Wi’  ‘ 


(C.18.5) 


for  all  (xN_^,rN_k) .  Combining  (C.18.1)  -  (C.18.5),  it  is  clear  that 


(7.150)  is  true  if 


VN-k(XN-k'rN-k)  -  VN-k(XN-k'rN-k) 


(C.18.6) 


for all $(x_{N-k}, r_{N-k})$. We can verify (C.18.6) by noting that $V_{N-k}(x_{N-k}, r_{N-k})$ is the cost achieved by the optimal controller for the problem (C.18.7) (for $(N-k) \le (N-p)$):

P-1 


Vk'Vk'Vk1  a  min 


VkM"'Vk+P-i 


XN-k+l+l2 (rN-k+i+l} 


^-k+H+l  S(rN-k+«.+l) 


P(rN-k+il+l) 


lm°  UN-k+£  R(rN-k+£) 


VN(XN-k+p'  rN-k+p^ 


(C.18.7) 


Comparing  (C.18.7)  with  the  problem  (7.152)  that  V  'mx  ,r  v) 


solves,  we  see  that  (C.18.6)  is  true.  Thus  Proposition  7.16(2)  holds; 


Proposition  7.16(3)  follows  immediately. 


D.  APPENDICES  TO  PART  IV 


D.1  One-Step Solution Equations (for Proposition 8.1)


For  t  =  1,2 ,  —  '^+i  let 


JL31  be  the  index  of  A..  (*)  valid  in  (8.3)  when 
t  D  ^ 


V  e  'Sc.i 


<t) 


531  be  the  index  of  the  piece  of  \+1  ^xk+i,tk+i=^ 


valid  when  x^+1  6  A£+1(t), 


Vi¬ 


be  The  index  of  the  x-cost 


^  (xk+l,rk+l=‘i)  Valid  When 


Xk+1  e  \+l (t) 


and  in  proof  step  1  of  Proposition  8.1. 


Define the conditional cost parameters in (8.39) by
M 


*2+i<«  -  i  Si  ,Kk*i  (5r!i)  +  =xi>j 

x=l  J 


M 


■  l  Si «*"»  *l1’"  + 


i*l  3 
M 


<£♦!<'>  -  J  \  (cJSi)  *  P1!?;1)] 


i*l 


(D.1.1)


(D.1.2)


(D.1.3)


Suppose that $b(j) \ne 0$. Then let


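$$\theta^j_k(\psi^j_{k+1}) = \gamma^j_{k+1}(\psi^j_{k+1}-1)\left[1 + \frac{b^2(j)\,\hat K^j_{k+1}(\psi^j_{k+1})}{R(j)}\right] + \frac{b^2(j)}{2R(j)}\,\hat H^j_{k+1}(\psi^j_{k+1})$$    (D.1.4)

$$\Theta^j_k(1) = \gamma^j_{k+1}(1)\left[1 + \frac{b^2(j)\,\hat K^j_{k+1}(1)}{R(j)}\right] + \frac{b^2(j)}{2R(j)}\,\hat H^j_{k+1}(1)$$    (D.1.5)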


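For $t = 2, \ldots, \psi^j_{k+1} - 1$, if

$$\hat K^j_{k+1}(t) > -\frac{R(j)}{b^2(j)}$$    (D.1.6)

then let

$$\theta^j_k(t) = \gamma^j_{k+1}(t-1)\left[1 + \frac{b^2(j)\,\hat K^j_{k+1}(t)}{R(j)}\right] + \frac{b^2(j)}{2R(j)}\,\hat H^j_{k+1}(t)$$    (D.1.7)

$$\Theta^j_k(t) = \gamma^j_{k+1}(t)\left[1 + \frac{b^2(j)\,\hat K^j_{k+1}(t)}{R(j)}\right] + \frac{b^2(j)}{2R(j)}\,\hat H^j_{k+1}(t)$$    (D.1.8)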


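For $t = 2, \ldots, \psi^j_{k+1} - 1$, if

$$\hat K^j_{k+1}(t) < -\frac{R(j)}{b^2(j)}$$    (D.1.9)

then let

$$\theta^j_k(t) = \Theta^j_k(t) = \frac{[\gamma^j_{k+1}(t) + \gamma^j_{k+1}(t-1)]\,[R(j) + b^2(j)\,\hat K^j_{k+1}(t)] + b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)}$$    (D.1.10)

For $t = 2, \ldots, \psi^j_{k+1} - 1$, if

$$\hat K^j_{k+1}(t) = -\frac{R(j)}{b^2(j)}$$    (D.1.11)

then let

$$\theta^j_k(t) = \Theta^j_k(t) = \frac{b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)}$$    (D.1.12)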

Note  that  (D.1.12)  is  consistent  with  (D.1.7),  (D.1.8)  and  (D.1.10)  when 
(D.1.11)  holds. 

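The candidate costs-to-go in (8.44), the corresponding optimal control laws in (8.45), and the optimal $x_{k+1} \in \Delta^j_{k+1}(t)$ values achieved by these controls are:

$$V_k^{t,L}(x_k,j) = x_k^2\,\bar K^j_k + x_k\,\bar H^j_k(t-1) + G^j_k(t-1,t)$$    (D.1.13)

$$u_k^{t,L}(x_k,j) = -\bar L^j x_k + \bar F^j_k(t-1)$$    (D.1.14)

$$x_{k+1}^{t,L}(x_k,j) = \gamma^j_{k+1}(t-1)$$    (D.1.15)

for $t = 2, 3, \ldots, \psi^j_{k+1}$ if
$a(j)\,x_k \le \theta^j_k(t)$

and

$$V_k^{t,R}(x_k,j) = x_k^2\,\bar K^j_k + x_k\,\bar H^j_k(t) + G^j_k(t,t)$$    (D.1.16)

$$u_k^{t,R}(x_k,j) = -\bar L^j x_k + \bar F^j_k(t)$$    (D.1.17)

$$x_{k+1}^{t,R}(x_k,j) = \gamma^j_{k+1}(t)$$    (D.1.18)

for $t = 1, 2, \ldots, \psi^j_{k+1} - 1$ if
$\Theta^j_k(t) \le a(j)\,x_k$.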
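For $t = 1$, $t = \psi^j_{k+1}$, and for $t = 2, \ldots, \psi^j_{k+1} - 1$ if (D.1.6) holds, we have

$$V_k^{t,U}(x_k,j) = x_k^2\,K^j_k(t) + x_k\,H^j_k(t) + G^j_k(t)$$    (D.1.19)

$$u_k^{t,U}(x_k,j) = -L^j_k(t)\,x_k + F^j_k(t)$$    (D.1.20)

$$x_{k+1}^{t,U}(x_k,j) = [a(j) - b(j)\,L^j_k(t)]\,x_k + b(j)\,F^j_k(t)$$    (D.1.21)

if $\theta^j_k(t) < a(j)\,x_k < \Theta^j_k(t)$.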


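In (D.1.13) - (D.1.14) and (D.1.16) - (D.1.17) we have

$$\bar K^j_k = \frac{a^2(j)\,R(j)}{b^2(j)}$$    (D.1.22)

$$\bar H^j_k(t) = -\frac{2\,a(j)\,R(j)\,\gamma^j_{k+1}(t)}{b^2(j)} \quad \text{for } t = 1, \ldots, \psi^j_{k+1} - 1$$    (D.1.23)

$$G^j_k(s,t) = \hat G^j_{k+1}(t) + \gamma^j_{k+1}(s)\,\hat H^j_{k+1}(t) + [\gamma^j_{k+1}(s)]^2\left[\hat K^j_{k+1}(t) + \frac{R(j)}{b^2(j)}\right]$$    (D.1.24)

defined for $s = t$ for $t = 1, \ldots, \psi^j_{k+1} - 1$ and $s = t-1$ for $t = 2, \ldots, \psi^j_{k+1}$,

$$\bar L^j = a(j)/b(j)$$    (D.1.25)

$$\bar F^j_k(t) = \gamma^j_{k+1}(t)/b(j)$$    (D.1.26)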


In  (D.1.19)  -  (D.1.21)  we  have 

K^(t)  =  KT(j) 
a2 (j)  R(j)  (t) 

K?(t)  =  - 5 - ^4 -  (D.1.27) 

R(j)  +  b^(j)  ^  (t) 


H^(t)  =  HT(j) 
a(j)  R(j)  Sj  (t) 

H?(t)  =  -  -  (D.1.28) 

R(j>  +  tT(j)  Kj  <t) 


Gk(t>  - 


ss(t)  *  V3’ 

h2(jl  ti^lt))2 
4[R(j!  +  b2(j)  K^tt)] 


and 


(D.1.29) 


(D.1.30) 


a(j)  b ( j )  K^+1(t) 
R(j)  +  b2(j)  K^+1(t) 


(D.1.31) 


-b<3> 

2[R(j)  +  b2  ( j)  £jj+1<t>] 


(D.1.32) 


The values of $m_k(j)$, $\{\delta_k(t:j) : t = 1, \ldots, m_k(j) - 1\}$, $K_k(t:j)$, $H_k(t:j)$, $G_k(t:j)$, $L_k(t:j)$ and $F_k(t:j)$ are assigned, for each $j \in M$, by performing the minimization indicated in (8.49). The procedure for doing this is given in section 8.5. The derivation of (D.1.4) - (D.1.32) is done in the next appendix section.
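The convex-piece formulas can be checked numerically. The following is a minimal sketch, assuming the scalar dynamics $x_{k+1} = a x_k + b u_k$, a one-step cost $R u_k^2$, and a single quadratic cost-to-go piece $\hat K x_{k+1}^2 + \hat H x_{k+1} + \hat G$ valid on $(\gamma_{lo}, \gamma_{hi})$. The function and argument names are illustrative only and do not appear in the text.

```python
def step_cost(x, x_next, a, b, R, K_hat, H_hat, G_hat):
    """Control cost of reaching x_next from x, plus the quadratic cost-to-go piece."""
    u = (x_next - a * x) / b  # control implied by x_next = a*x + b*u
    return R * u * u + K_hat * x_next ** 2 + H_hat * x_next + G_hat


def unconstrained_piece(a, b, R, K_hat, H_hat, G_hat, g_lo, g_hi):
    """Joining points and cost/control parameters for one strictly convex piece.

    Assumes b != 0, R > 0 and K_hat > -R/b**2 (the condition in (D.1.6)).
    """
    if b == 0 or R <= 0:
        raise ValueError("sketch assumes b != 0 and R > 0")
    D = R + b * b * K_hat
    if D <= 0:
        raise ValueError("piece is not strictly convex; these formulas do not apply")
    scale = D / R                      # 1 + b^2 K_hat / R
    shift = b * b * H_hat / (2.0 * R)  # b^2 H_hat / (2R)
    return {
        "theta": g_lo * scale + shift,  # lower joining point, in units of a*x_k
        "Theta": g_hi * scale + shift,  # upper joining point, in units of a*x_k
        "K": a * a * R * K_hat / D,
        "H": a * R * H_hat / D,
        "G": G_hat - b * b * H_hat ** 2 / (4.0 * D),
        "L": a * b * K_hat / D,
        "F": -b * H_hat / (2.0 * D),
    }
```

For any $x_k$ with $\theta < a x_k < \Theta$, the control $u = -L x_k + F$ should place $x_{k+1}$ inside the piece, and $K x_k^2 + H x_k + G$ should agree with `step_cost` evaluated at that $x_{k+1}$.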


If $b(j) = 0$ then the optimal control is

$$u_k(x_k, r_k=j) = 0$$    (D.1.33)

with

$$x_{k+1} = a(j)\,x_k$$    (D.1.34)

and cost

$$V_k(x_k,j) = a^2(j)\,\hat K^j_{k+1}(t)\,x_k^2 + a(j)\,\hat H^j_{k+1}(t)\,x_k + \hat G^j_{k+1}(t)$$    (D.1.35)

where the index t is determined by which region $\Delta^j_{k+1}(t)$ the $x_{k+1}$ value is in (for each $x_k$ value).
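The zero-gain case amounts to a table lookup. Below is a small sketch, assuming the interior grid points $\gamma(1) < \cdots < \gamma(\psi-1)$ are held in a sorted list and the pieces are indexed from 0; the names and the handling of a state that lands exactly on a grid point are assumptions made here, not conventions taken from the text.

```python
import bisect


def zero_gain_step(x, a, grid, K_hat, H_hat, G_hat):
    """b(j) = 0: the control cannot move the state, so u = 0 and x_next = a*x.

    grid  : sorted interior grid points (length psi - 1)
    K_hat, H_hat, G_hat : per-piece parameters (length psi), piece t covering
                          the interval between grid[t-1] and grid[t]
    Returns (u, x_next, piece index, cost).
    """
    x_next = a * x
    t = bisect.bisect_left(grid, x_next)  # index of the piece containing x_next
    cost = K_hat[t] * x_next ** 2 + H_hat[t] * x_next + G_hat[t]
    return 0.0, x_next, t, cost
```

Written in terms of $x_k$, the returned cost is $a^2 \hat K(t) x_k^2 + a \hat H(t) x_k + \hat G(t)$, matching the form of (D.1.35).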

When $b(j) = 0$, (D.1.33) - (D.1.35) are the same as

(D.1.13) - (D.1.15) with $\theta^j_k(t)$ and $\Theta^j_k(t)$ as in (D.1.4) - (D.1.8),


3 


D.2  Derivation of (D.1.4) - (D.1.32)


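From (8.42) and (D.1.1) - (D.1.3) we have that

$$V_k[x_k, r_k=j \mid t] = \min_{u_k}\left\{u_k^2\,R(j) + x_{k+1}^2\,\hat K^j_{k+1}(t) + x_{k+1}\,\hat H^j_{k+1}(t) + \hat G^j_{k+1}(t)\right\} \quad \text{s.t. } x_{k+1} \in \Delta^j_{k+1}(t)$$    (D.2.1)

From (8.1) we have (for $b(j) \ne 0$) that

$$u_k = \frac{x_{k+1} - a(j)\,x_k}{b(j)} \qquad (b(j) \ne 0).$$    (D.2.2)

Thus (D.2.1) becomes

$$V_k[x_k, r_k=j \mid t] = \min_{x_{k+1} \in \Delta^j_{k+1}(t)}\left\{x_{k+1}^2\left[\hat K^j_{k+1}(t) + \frac{R(j)}{b^2(j)}\right] + x_{k+1}\left[\hat H^j_{k+1}(t) - \frac{2a(j)\,R(j)\,x_k}{b^2(j)}\right] + \left[\hat G^j_{k+1}(t) + \frac{a^2(j)\,R(j)\,x_k^2}{b^2(j)}\right]\right\}$$    (D.2.3)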


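Suppose that $b(j) \ne 0$ and

$$\frac{\partial^2 V_k[x_k, r_k=j \mid t]}{(\partial x_{k+1})^2} = 2\left[\frac{R(j)}{b^2(j)} + \hat K^j_{k+1}(t)\right] > 0$$    (D.2.4)

Then we can minimize (D.2.3) by differentiating with respect to $x_{k+1}$ and setting to zero. We find that the optimal $x_{k+1}$ is then

$$x_{k+1}^{t,U} = \frac{2a(j)\,R(j)\,x_k - b^2(j)\,\hat H^j_{k+1}(t)}{2\,[R(j) + b^2(j)\,\hat K^j_{k+1}(t)]}$$    (D.2.5)

if this $x_{k+1}^{t,U}$ is, in fact, in $\Delta^j_{k+1}(t)$.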


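For $t = 1$ and $t = \psi^j_{k+1}$, (D.2.4) is always true. For $t = 1$, $x_{k+1}^{1,U} \in \Delta^j_{k+1}(1)$ if and only if

$$a(j)\,x_k < \frac{\gamma^j_{k+1}(1)\,[R(j) + b^2(j)\,\hat K^j_{k+1}(1)]}{R(j)} + \frac{b^2(j)\,\hat H^j_{k+1}(1)}{2R(j)}$$    (D.2.6)

We define the right side of (D.2.6) to be $\Theta^j_k(1)$, as in (D.1.5).

For $t = \psi^j_{k+1}$, $x_{k+1}^{t,U} \in \Delta^j_{k+1}(\psi^j_{k+1})$ if and only if

$$\frac{\gamma^j_{k+1}(t-1)\,[R(j) + b^2(j)\,\hat K^j_{k+1}(t)]}{R(j)} + \frac{b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)} < a(j)\,x_k$$    (D.2.7)

We define the left side of (D.2.7) to be $\theta^j_k(\psi^j_{k+1})$, as in (D.1.4).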

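For $t = 2, \ldots, \psi^j_{k+1} - 1$ with (D.2.4) holding, $x_{k+1}^{t,U} \in \Delta^j_{k+1}(t)$ if and only if

$$\frac{\gamma^j_{k+1}(t-1)\,[R(j) + b^2(j)\,\hat K^j_{k+1}(t)]}{R(j)} + \frac{b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)} < a(j)\,x_k < \frac{\gamma^j_{k+1}(t)\,[R(j) + b^2(j)\,\hat K^j_{k+1}(t)]}{R(j)} + \frac{b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)}$$    (D.2.8)

The left and right sides of (D.2.8) are defined to be $\theta^j_k(t)$ and $\Theta^j_k(t)$ respectively, as in (D.1.7) - (D.1.8). Substituting (D.2.5) into (D.2.2) and (D.2.3) yields (D.1.19) - (D.1.21) and (D.1.27) - (D.1.32).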
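The substitution can be written out by completing the square. The lines below are a sketch of that algebra, with $D$ introduced here as shorthand for $R(j) + b^2(j)\hat K^j_{k+1}(t)$ and the arguments $(j)$, $(t)$ suppressed; the hat notation for the conditional cost parameters is assumed.

```latex
\begin{aligned}
D &\equiv R + b^2 \hat K > 0, \\
V_k^{t,U}(x_k, j)
  &= \hat G + \frac{a^2 R\, x_k^2}{b^2}
     - \frac{b^2}{4D}\left[\hat H - \frac{2 a R\, x_k}{b^2}\right]^2 \\
  &= x_k^2\,\frac{a^2 R\, \hat K}{D}
     + x_k\,\frac{a R\, \hat H}{D}
     + \hat G - \frac{b^2 \hat H^2}{4D}, \\
u_k^{t,U}(x_k, j)
  &= \frac{x_{k+1}^{t,U} - a\, x_k}{b}
   = -\frac{a b\, \hat K}{D}\, x_k - \frac{b\, \hat H}{2D}.
\end{aligned}
```

The coefficients of $x_k^2$, $x_k$ and the constant term give the cost parameters, and the two terms of the control give the gain and offset.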


Now if $b(j) \ne 0$ and (D.2.4) holds but
$a(j)\,x_k \le \theta^j_k(t)$,

(D.2.4) implies that the best we can do is to drive $x_{k+1}$ to $\gamma^j_{k+1}(t-1)$, the left boundary of $\Delta^j_{k+1}(t)$. Thus from (D.2.2)

$$u_k^{t,L}(x_k,j) = \frac{-a(j)\,x_k + \gamma^j_{k+1}(t-1)}{b(j)}$$    (D.2.9a)

which yields (D.1.13) - (D.1.15) with (D.1.22) - (D.1.26).
Similarly, if

$a(j)\,x_k \ge \Theta^j_k(t)$

the best we can do is drive $x_{k+1}$ to $\gamma^j_{k+1}(t)$, the right boundary of $\Delta^j_{k+1}(t)$. We then obtain

$$u_k^{t,R}(x_k,j) = \frac{-a(j)\,x_k + \gamma^j_{k+1}(t)}{b(j)}$$    (D.2.9b)

which yields (D.1.16) - (D.1.18) with (D.1.22) - (D.1.26).


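If $b(j) \ne 0$ and we have

$$\frac{\partial^2 V_k[x_k, r_k=j \mid t]}{(\partial x_{k+1})^2} < 0$$    (D.2.10)

then the optimal $x_{k+1}$ is at one of the boundaries of $\Delta^j_{k+1}(t)$. We drive $x_{k+1}$ to the left boundary, $\gamma^j_{k+1}(t-1)$, if

$$[\gamma^j_{k+1}(t-1)]^2\,\hat K^j_{k+1}(t) + \gamma^j_{k+1}(t-1)\,\hat H^j_{k+1}(t) + \left(\frac{\gamma^j_{k+1}(t-1) - a(j)\,x_k}{b(j)}\right)^2 R(j) \le [\gamma^j_{k+1}(t)]^2\,\hat K^j_{k+1}(t) + \gamma^j_{k+1}(t)\,\hat H^j_{k+1}(t) + \left(\frac{\gamma^j_{k+1}(t) - a(j)\,x_k}{b(j)}\right)^2 R(j)$$    (D.2.11)

and to the right boundary, $\gamma^j_{k+1}(t)$, otherwise. We can rewrite (D.2.11) as

$$a(j)\,x_k \le \frac{[b^2(j)\,\hat K^j_{k+1}(t) + R(j)]\,[\gamma^j_{k+1}(t) + \gamma^j_{k+1}(t-1)] + b^2(j)\,\hat H^j_{k+1}(t)}{2R(j)}$$    (D.2.12)

The right side of (D.2.12) is defined to be $\theta^j_k(t) = \Theta^j_k(t)$ when $b(j) \ne 0$ and (D.2.10) holds, as in (D.1.10).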
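The concave case can be illustrated by comparing the two boundary costs directly and checking that the switch occurs at the threshold on $a x_k$. This is a sketch under the same assumed scalar model ($x_{k+1} = a x_k + b u_k$, control cost $R u_k^2$); the function names are hypothetical.

```python
def concave_switch_point(b, R, K_hat, H_hat, g_lo, g_hi):
    """Threshold on a*x_k at which the two boundary costs of a concave piece are equal."""
    return ((g_lo + g_hi) * (R + b * b * K_hat) + b * b * H_hat) / (2.0 * R)


def best_boundary(x, a, b, R, K_hat, H_hat, g_lo, g_hi):
    """Pick the cheaper boundary by direct cost comparison (left boundary wins ties)."""
    def cost(z):
        u = (z - a * x) / b  # control needed to land exactly on z
        return R * u * u + K_hat * z * z + H_hat * z
    return g_lo if cost(g_lo) <= cost(g_hi) else g_hi
```

The constant term of the cost-to-go piece is omitted from `cost` because it is common to both boundaries and does not affect the comparison.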


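If $b(j) \ne 0$ and we have

$$\frac{\partial^2 V_k[x_k, r_k=j \mid t]}{(\partial x_{k+1})^2} = 2\left[\frac{R(j)}{b^2(j)} + \hat K^j_{k+1}(t)\right] = 0$$    (D.2.13)

then for each $x_k$ value, the quantity that is to be minimized in (D.2.3) is a linear function of $x_{k+1}$. When we have

$$\frac{\partial V_k[x_k, r_k=j \mid t]}{\partial x_{k+1}} = \hat H^j_{k+1}(t) - \frac{2a(j)\,R(j)\,x_k}{b^2(j)} > 0$$    (D.2.14)

then the best $x_{k+1}$ value in $\Delta^j_{k+1}(t)$ is the left boundary, $\gamma^j_{k+1}(t-1)$, since the cost to be minimized in (D.2.3) increases with $x_{k+1}$ (for fixed $x_k$). When we have

$$\frac{\partial V_k[x_k, r_k=j \mid t]}{\partial x_{k+1}} = \hat H^j_{k+1}(t) - \frac{2a(j)\,R(j)\,x_k}{b^2(j)} < 0$$    (D.2.15)

the best $x_{k+1} \in \Delta^j_{k+1}(t)$ is $\gamma^j_{k+1}(t)$. For

$$\hat H^j_{k+1}(t) = \frac{2a(j)\,R(j)\,x_k}{b^2(j)}$$    (D.2.16)

any $x_{k+1} \in \Delta^j_{k+1}(t)$ yields the same result in (D.2.3) (for fixed $x_k$).

From (D.2.13) - (D.2.16) we thus get (D.1.11) - (D.1.12).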


D.3  Proof  of  Proposition  8.2: 


We first note that relationships (C.3.1) - (C.3.3) of Lemma C.3.1 hold when both

$$\hat K^j_{k+1}(t) > -\frac{R(j)}{b^2(j)}$$    (D.3.1)

$$\hat K^j_{k+1}(\ell) > -\frac{R(j)}{b^2(j)}$$    (D.3.2)

since the $\theta$'s and $\Theta$'s are defined by (D.1.4) - (D.1.8) (which are the same as (C.1.4) - (C.1.6)).

Let us assume that $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ is continuous at $\gamma^j_{k+1}(t)$ and that (D.3.1) - (D.3.2) hold for t and $\ell = t+1$. By continuity

$$V_k^{t,R}(x_k,j) = V_k^{t+1,L}(x_k,j)$$    (D.3.3)

since we are driving to the same $x_{k+1}$ value in each, with the same cost. Hence from (D.1.24)

$$G^j_k(t,t) = G^j_k(t,t+1).$$    (D.3.4)


Suppose  that  we  also  have 


k+l 


'k+l  (xk+llrk“j) 


k+l  ' 1 k+l 


(D.3. 5) 


That  is 


2  Yk+i(t) 


2  "4+i<,:)  yLi(t) 


- 


(D. 3.6) 


Then  by  Lemma  C.3.1  we  have 


e^(t+i)  <_  ej(t) 


(D.3.7) 


Now (D.3.3) and (D.3.7) are together sufficient to guarantee that neither $V_k^{t,R}(x_k,j)$ nor $V_k^{t+1,L}(x_k,j)$ can be optimal for any $x_k$,


since 


<’*  <vj)  ±v£'u 


k  (vj) 


Vk+1,L(xk'?>-  Vk+1,U(xk,j) 

for all $x_k$. Thus for each $\gamma^j_{k+1}(t)$ at which $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ is continuous, with (D.3.1) - (D.3.2) holding for t and $\ell = t+1$ and with (D.3.6) holding:


min  {yJ['W(*k»jJ  '  Vk,r'(xk'j)f  VkTA,1"(xk'j)  '  Vk+1,°  (xk'j)  * 


Vk'U(xk'^ 


°r  ’T'V” 


(D.3.8) 


for  each 


This  verifies  (i)  of  Proposition  8.2. 


Suppose that (D.3.1) - (D.3.2) hold for t and $\ell = t+1$ and $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ is continuous at $\gamma^j_{k+1}(t)$ but


ix,  ^k+l(xk+llrk"^  > 


*4+1  ^k+l(Xk 


rk+l=j) 


.  (D.3.9) 


W  ('£+ilt)1 


That  is. 


,  (D.3.10) 


Then  by  Lemma  C.3.1  we  have 


0^ (t+1)  >  ©£(t) 


(D.3.11) 


From  (D.3.3),  (D.3.11)  we  have  that 


,R  .  _  t+1  ,L , 

\  ‘’S'31  =  \  ‘v3’ 


may  be  optimal  for 


ej(t)  <  a(j)  ^  <©jj(t+l) 


Hence  we  have  (8.53)  in  (i)  of  Proposition  8.2 


<  rr^ 

b  (3) 


(D.3.12) 


9k(t)  =  0k(t)  ' 


(D.3.13) 
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xs  given  by  (D1.10).  Here 


,t+l,U 


is  never  valid 


hence  (ii)  of  Proposition  8.2 


A  , 

Now  suppose  that  (x^+^ | r^= j )  is  not  continuous  at 

(t) .  Suppose  that 

w[-4+1“>n«  <  wfTjU‘«i+u> 


(D.3.14) 


That is, $\hat V_{k+1}(x_{k+1} \mid r_k = j)$ has a discontinuous increase at $\gamma^j_{k+1}(t)$.

This  can  happen  only  if  yj^Ct)  is  a  f°*m  transition  probability 
i  6  {1,. . .  ,\7  ,  or  an  x-cost  discontinuity  y^Cn), 

n  6  {1, . . .  ,y^-l}  ,  for  some  i  6  C  .. 

Then  clearly 


Tft  f  R  .  •  •  .  1 1 L  .  . . 

\  {xk'3)  <  vk  (V3) 


(D.3.15) 


So in this case $V_k^{t+1,L}(x_k,j)$ cannot be optimal for any $x_k$. However, $V_k^{t,R}(x_k,j)$ may be. Similarly, if


Vk+1( tYk+l{t) ]  >  Vk+l(CYk+l(t)] 


CD. 3. 16) 


(which  implies  that  yj^  is  a  form  transition  probability  discontinuity 
or  x-cost  discontinuity,  then 


.t+1  ,L 


<Vj) 


'k  <Vj) 


(D. 3. 17) 


hence $V_k^{t,R}(x_k,j)$ cannot be optimal. Thus we need consider only the candidate costs-to-go listed in the statement of Proposition 8.2. □




D.4  JLPC  One-Step  Solution  Details  (for  Proposition  9.1) 

In  this  section  we  provide  details  for  the  computations  in  steps 
3  and  4  of  the  constructive  proof  of  Proposition  9.1. 


Obtaining  the  z^+1  grid  (in  step  3) : 


It is straightforward to verify that for each $s = 1, \ldots, r$, and $t = 1, \ldots, \psi^j_{k+1}$ in (9.101) we have

$$\min[\sigma(s),\ \max[\gamma^j_{k+1}(t) - z_{k+1},\ \sigma(s-1)]] = \max[\sigma(s-1),\ \min[\gamma^j_{k+1}(t) - z_{k+1},\ \sigma(s)]] = L^j_{k+1}(s,t),$$    (D.4.1)


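where the numerical value of each integration limit $L^j_{k+1}(s,t)$ depends upon $z_{k+1}$ as follows:

$$L^j_{k+1}(s,t) = \begin{cases} \sigma(s) & \text{if } z_{k+1} \le A^j(s,t) = \gamma^j_{k+1}(t) - \sigma(s) \\ \gamma^j_{k+1}(t) - z_{k+1} & \text{if } A^j(s,t) \le z_{k+1} \le B^j(s,t) \\ \sigma(s-1) & \text{if } z_{k+1} \ge B^j(s,t) = \gamma^j_{k+1}(t) - \sigma(s-1) \end{cases}$$    (D.4.2)

The values $\{A^j(s,t), B^j(s,t) : s = 1, \ldots, r;\ t = 1, \ldots, \psi^j_{k+1} - 1\}$ in (D.4.2) comprise a tentative partition of $z_{k+1}$. Given the grid points $\{\gamma^j_{k+1}(t)\}$, $\{\sigma(s)\}$ we obtain the tentative partition

$$\tilde\Delta(\ell) = (\tilde\gamma(\ell-1),\ \tilde\gamma(\ell)), \qquad \ell = 1, \ldots, \tilde\psi,$$

where the $\tilde\psi - 1$ grid points are distinct elements of the set

$$\{\gamma^j_{k+1}(t) - \sigma(s) : t = 1, \ldots, \psi^j_{k+1} - 1;\ s = 1, \ldots, r - 1\}$$    (D.4.3)

and are ordered as follows:

$$-\infty = \tilde\gamma(0) < \tilde\gamma(1) < \cdots < \tilde\gamma(\tilde\psi - 1) < \tilde\gamma(\tilde\psi) = \infty.$$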
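The clamping identity in (D.4.1) and the construction of the tentative grid in (D.4.3) can be sketched as follows. The list arguments, the tolerance used to merge coincident points, and the function names are assumptions made for illustration.

```python
import math


def integration_limit(z, gamma_t, sigma_lo, sigma_hi):
    """gamma(t) - z clamped to [sigma(s-1), sigma(s)], as in (D.4.1)."""
    return min(sigma_hi, max(gamma_t - z, sigma_lo))


def tentative_grid(gammas, sigmas, tol=1e-12):
    """Sorted distinct values gamma(t) - sigma(s), with -inf and +inf as end points.

    gammas : interior x-grid points gamma(1..psi-1)
    sigmas : interior noise-density break points sigma(1..r-1)
    """
    points = sorted(g - s for g in gammas for s in sigmas)
    distinct = []
    for p in points:
        if not distinct or p - distinct[-1] > tol:  # drop (near-)duplicates
            distinct.append(p)
    return [-math.inf] + distinct + [math.inf]
```

Consecutive entries of the returned list are the end points of the tentative intervals; within each one, every integration limit is either a constant or of the form $\gamma(t) - z_{k+1}$.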

To obtain the $z_{k+1}$ partition $\{\hat\Delta^j_{k+1}(t) : t = 1, \ldots, \hat\psi_{k+1}\}$ of (9.97) - (9.98) we must make the evaluation indicated in (9.102), and add extra grid points to (D.4.3) as needed.

Using (D.4.1) - (D.4.2) we can determine the limits of integration in (9.101) over each interval in (D.4.3). We can then evaluate $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ over each of these intervals $\tilde\Delta(\ell)$, by (9.101). An efficient way to carry out these computations is to do them for $z_{k+1} \in \tilde\Delta(1)$, and then to successively calculate $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ over $\tilde\Delta(\ell+1)$ from $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ over $\tilde\Delta(\ell)$ by adding or subtracting (as appropriate) those integrals in (9.101) whose limits change when we move from $\tilde\Delta(\ell)$ to $\tilde\Delta(\ell+1)$.

That is:


A 

A  a* 

Compute  vk+1(zk+1lrk=3)  over  Ml).  By  (D.4.2)^  the 
integration  limits <  J(s,t)  are  all  equal  to  a(s).  Thus 
for  all  zk+^  €  A(l),  (9.101)  becomes 


a  ra(  s) 

l  J  ,  1"(v;s)vk+i(2k+i-H/;1)dv 

s=l  •'a(s-l)  # 

(D.4.4) 


k+l 


Zk+i6A(D 


) 


1. 


2.  Compute  vk+1  I rk” j  »  zk+1  6  AU+l))  from 

/v 

A 

Vk+l(zk+llrkx;5,  zk+l  6  as  follows: 


m  if  y^+^(H)  =  A^(s  ,t  )  for  some  s  ,t  in  (D.4.2), 

then  the  limit  L^(s  ,t  )  in  (D.4.1)  -  (D.4.2)  becomes 

1  *  * 

Y^+^(t  )  -  zk+]_  instead  of  a(s  );  consequently  we  add 

(  *) 

f  ^k+l^k+l4^^*^  *  Vk+l(zk+l+v;t*)]  dV 


Yk+l(t  )"Zk+l 


(D.4.5) 


t0  Vk+l(zk+llrk=j*  Zk+1  6  A(i) 


,  if  Y^+1(®>)  -  8d(s*,t*)  for  some  s*,t*  in  (D.4.2), 

4  £  £ 

then  the  limit  Lr (s  ,t  )  in  (D.4.1)  -  (D.4.2)  becomes 

A  4  ^ 

cj(s  -1)  instead  of  Y£+1(t  )  ~  zk+i»  consequently  we  add 
/*a(s  -1)  *  ^ 

J  «(vjs  >Evi+1(zk+1+v;t  )  -  Vk+1(zk+14v;t  +1)]J* 


Yk+l(t  )-Zk+l 


t0  ^^“k+l^k"^  zk+l  6  AW)’ 


(D.4.6) 


Adding  the  integrals  specified  by  (D.4.5)  -  (D.4.6)  to 

A 

W'k+JvJ-  =k+1  e  M«)  yields 

Vk+i ^ zk+i  I rk“3  »  zk+l  6  A^1J}*  This  is  done  sequentially 

A  **  **  1 

until  Vk+1(zk+1|rk-j,  zk+1  €  A(^))  is  obtained. 


If, however, the JLPC control problem is completely symmetric about zero, we need only follow this procedure for $z_{k+1}$ intervals to the left of zero.

3.  Now we differentiate each $\hat V_{k+1}(z_{k+1} \mid r_k = j,\ z_{k+1} \in \tilde\Delta(\ell))$ twice with respect to $z_{k+1}$. Based upon the second derivative, we break up $\tilde\Delta(\ell)$ into disjoint intervals such that $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ is convex or concave over each.

Following the three steps, we obtain the partition and $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ pieces described by (9.97) - (9.98).
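Step 3 can be approximated numerically when the second derivative is available as a function. The sketch below samples the interval, detects sign changes, and refines each by bisection; the sampling density, tolerance, and function name are assumptions, and a zero that falls exactly on a sample point or a pair of closely spaced zeros can be missed.

```python
def split_by_curvature(d2, lo, hi, samples=200, tol=1e-10):
    """Break [lo, hi] at sign changes of the second derivative d2.

    Returns break points lo = p0 < p1 < ... < pn = hi such that d2 keeps one
    sign on each (p_i, p_{i+1}), as far as the sampling can resolve.
    """
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    points = [lo]
    for left, right in zip(xs, xs[1:]):
        if d2(left) * d2(right) < 0:      # sign change inside (left, right)
            a_, b_ = left, right
            while b_ - a_ > tol:          # bisection on the bracketed zero
                mid = 0.5 * (a_ + b_)
                if d2(a_) * d2(mid) <= 0:
                    b_ = mid
                else:
                    a_ = mid
            points.append(0.5 * (a_ + b_))
    points.append(hi)
    return points
```

The same routine can be applied to the second derivative shifted by $2R(j)/b^2(j)$ if the intervals are to be classified by the sign test of the subproblem rather than by convexity of the cost-to-go alone.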


Solving the constrained-in-$z_{k+1}$ subproblems (in step 4):


The subproblems in (9.103) are solved as follows:

1.  If $\partial^2 V_k(x_k, r_k=j \mid t) / (\partial z_{k+1})^2$ in (9.111) is nonpositive over $\hat\Delta^j_{k+1}(t)$ then the optimal subproblem cost $V_k(x_k, r_k=j \mid t)$ in (9.103) has the two-point structure of (9.107). From (9.108) - (9.109), the joining point in (9.107) is given by

$$\theta^j_k(t) = \Theta^j_k(t) = \frac{\hat\gamma^j_{k+1}(t-1) + \hat\gamma^j_{k+1}(t)}{2} + \frac{b^2(j)\left[\hat V^j_{k+1}(z_{k+1};t)\big|_{z_{k+1} = \hat\gamma^j_{k+1}(t-1)} - \hat V^j_{k+1}(z_{k+1};t)\big|_{z_{k+1} = \hat\gamma^j_{k+1}(t)}\right]}{2R(j)\,[\hat\gamma^j_{k+1}(t-1) - \hat\gamma^j_{k+1}(t)]}$$    (D.4.7)

2.  If $\partial^2 V_k(x_k, r_k=j \mid t) / (\partial z_{k+1})^2$ in (9.111) is positive over $\hat\Delta^j_{k+1}(t)$ then we must solve (9.110) to obtain the $z_{k+1}$ (as a function of $x_k$) which minimizes (9.103). It is the optimal $z^{t,U}_{k+1}$ (resulting in cost $V_k^{t,U}(x_k,j)$ in (9.106)) for those $x_k$ values such that this $z_{k+1}$ is in $\hat\Delta^j_{k+1}(t)$ (if any such $x_k$ values exist).

Solving (9.110) to obtain $z^{t,U}_{k+1}(x_k,j)$ and computing the joining points $\theta^j_k(t)$, $\Theta^j_k(t)$ in (9.106) may be quite difficult to do analytically (depending upon the form of the function $\hat V^j_{k+1}(z_{k+1};t)$).
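When an analytic solution is impractical, both cases can be handled numerically. The sketch below assumes the subproblem cost has the form $R\,((z - a x_k)/b)^2 + \hat V(z)$, so that the stationarity condition is $\hat V'(z) + 2R(z - a x_k)/b^2 = 0$; this form is inferred from the surrounding derivations and is not quoted from (9.110). All names are hypothetical.

```python
def two_point_joining(V, g_lo, g_hi, b, R):
    """Value of a*x_k at which driving to g_lo and to g_hi cost the same."""
    return 0.5 * (g_lo + g_hi) + b * b * (V(g_lo) - V(g_hi)) / (2.0 * R * (g_lo - g_hi))


def solve_stationary(dV, ax, b, R, g_lo, g_hi, tol=1e-12):
    """Bisection for dV(z) + 2R(z - ax)/b^2 = 0 on [g_lo, g_hi].

    Assumes the left side is increasing in z (positive second derivative).
    Returns None when the zero lies outside the interval, in which case the
    constrained minimizer is one of the boundaries.
    """
    def f(z):
        return dV(z) + 2.0 * R * (z - ax) / (b * b)
    lo, hi = g_lo, g_hi
    if f(lo) > 0 or f(hi) < 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For a quadratic $\hat V$, `two_point_joining` reduces to the closed-form threshold of the concave JLQ case, which gives a simple consistency check.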


D.5  Proof of Proposition 9.2


We begin by verifying 4(i) - (ii). We are considering the solution of (9.113), subject to (9.114). The cost $V_k(x_k, r_k=j \mid t)$ can be written as a function of $z_{k+1}$, as in (9.103). To find the optimal $z_{k+1}$ in $\hat\Delta^j_{k+1}(t)$ we differentiate (9.103) twice with respect to $z_{k+1}$ as in (9.110) - (9.111). If for a given $x_k$ the first derivative is zero (i.e. (9.110) is satisfied) for some $z_{k+1} = z^*$ then we have (9.126) - (9.128) directly (with $z^{t,U}_{k+1} = z^*$).

This $z^*$ is the unconstrained optimal if the second derivative is positive (i.e. (9.111) is satisfied) and if $z^*$ is, in fact, in $\hat\Delta^j_{k+1}(t)$. Condition (9.125) results in the satisfaction of (9.111).

The definitions for $\theta^j_k(t)$ and $\Theta^j_k(t)$ in (9.129) - (9.130) correspond to $z^{t,U}_{k+1}(x_k,j) = z^*$ of (9.127) inside the interval $\hat\Delta^j_{k+1}(t)$. Since we have chosen the $z_{k+1}$ partition so that (9.111) is satisfied throughout $\hat\Delta^j_{k+1}(t)$ or not at all, the "inactive constraint" solution in (9.127) is unique. To verify that $\theta^j_k(t) < \Theta^j_k(t)$ for each $t = 1, \ldots, \hat\psi_{k+1}$ in (9.131) we note that


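$$\theta^j_k(t) = \hat\gamma^j_{k+1}(t-1) + \frac{b^2(j)}{2R(j)}\,\frac{\partial \hat V^j_{k+1}(z;t)}{\partial z}\bigg|_{z = [\hat\gamma^j_{k+1}(t-1)]^+}, \qquad \Theta^j_k(t) = \hat\gamma^j_{k+1}(t) + \frac{b^2(j)}{2R(j)}\,\frac{\partial \hat V^j_{k+1}(z;t)}{\partial z}\bigg|_{z = [\hat\gamma^j_{k+1}(t)]^-}$$    (D.5.1)

Since by (9.111)

$$\frac{\partial^2 \hat V^j_{k+1}(z;t)}{\partial z^2} + \frac{2R(j)}{b^2(j)} > 0$$    (D.5.2)

for all $z \in (\hat\gamma^j_{k+1}(t-1),\ \hat\gamma^j_{k+1}(t))$, the quantity $z + \frac{b^2(j)}{2R(j)}\,\partial \hat V^j_{k+1}(z;t)/\partial z$ is increasing in z over this interval, so that $\theta^j_k(t) < \Theta^j_k(t)$.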


This  completes  verification  of  4(i)  -  (ii) . 


We next establish items (1) - (3) of Proposition 9.2. Suppose that (9.111) is satisfied for all $z \in \hat\Delta^j_{k+1}(t)$ (i.e., the second derivative is positive), but that for a given $x_k$ the value of $z_{k+1}$ satisfying (9.110) is less than $\hat\gamma^j_{k+1}(t-1)$. Then the best (lowest cost) $z_{k+1}$ in $\hat\Delta^j_{k+1}(t)$ is on the boundary as in (9.116), which is obtained with control (9.115) and results in cost (9.117). If, however, the $z_{k+1}$ value satisfying (9.110) for a given $x_k$ is greater than $\hat\gamma^j_{k+1}(t)$, then (9.118) - (9.120) apply. Here the values of $\theta^j_k(t)$ and/or $\Theta^j_k(t)$ are given by (9.129) - (9.130).

Now suppose that (9.111) is not satisfied as in (9.121). Then the solution to (9.110) is not a minimum; the only choices of $z_{k+1}$ in $\hat\Delta^j_{k+1}(t)$ are the boundaries $\hat\gamma^j_{k+1}(t-1)$ and $\hat\gamma^j_{k+1}(t)$. In this case the values of $\theta^j_k(t)$ and/or $\Theta^j_k(t)$ in (1) - (2) are specified by the intersection of $V_k^{t,L}(x_k,j)$ and $V_k^{t,R}(x_k,j)$. This yields (9.122) - (9.124) directly. Thus we have established (1) - (3) of Proposition 9.2.

Items 4(iii) and 4(iv) of the proposition follow immediately. Item 4(v) is a direct consequence of the relationship
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k  k 


=  a2(j) 


k+1  k+1 

(3zk+l} 


(D.5.3) 


For  t=l  and  t=  top  , 
k+1 


<  0  then  as  z.  ,  -*■  00 

1  k+1 1 


/s  . 

we  will  have  Vk+l^zk+l;t^  <  ^ violates  the  requirements 


on  Vk+l(zk+llrk=j) 


D.6  Proof  of  Proposition  9.3 


To prove this proposition we first establish the following relationship between $\theta^j_k(t)$, $\Theta^j_k(t)$ and the slopes of $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ at the points $\{\hat\gamma^j_{k+1}(t)\}$ for those t where $V_k^{t,U}$ exists.


Lemma D.6.1


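For t and $\ell$ such that

$$\frac{\partial^2 \hat V^j_{k+1}(z_{k+1};t)}{(\partial z_{k+1})^2} + \frac{2R(j)}{b^2(j)} > 0$$    (D.6.1)

$$\frac{\partial^2 \hat V^j_{k+1}(z_{k+1};\ell)}{(\partial z_{k+1})^2} + \frac{2R(j)}{b^2(j)} > 0$$    (D.6.2)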


the  following  relationships  hold: 


1.  ej(t)  >  eJ(A) 


if  and  only  if 


3v^  (z  ;  t) 

k+1 '  k+1  1 


3z 


k+1 


3'W*wt> 


3z. 


Lmi 


k+1 


b2(j) 


zk+l3  k+l(t-1} 


zk+i“Y;+i(i-i) 


"^K*l lt’1) 
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2.  0^(t)  >  if  and  only  if 


3V,J. , ,  (z.  . , ;  t) 


k+1'  k+1 


3z 


k+1 


3V:>  (z 


k+1'  k+1 


3z. 


k+1 


>  2  R(j) 
b2(j) 


A  . 


k+1  'k+1 


a) 


\+iU) 


-Yk+i(t) 
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3.  0^(t)  >  0^(£.)  if  and  only  if 


<z_, ;  t) 


k+1'  k+1 


3z 


k+1 


3z. 


k+1 


>  Lmx 

b2  ( j) 


Zk+l*Yic+l(t) 


Zk+l^k+l(  -1} 


-wtJ 
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This  lemma  follows  directly  from  (9.129)  -  (9.130).  It  is  a 
generalization  of  Lemma  C.3.1. 


for  all  x^  (with  equality  only  at  ®k  ^ 

a(j) 


and  ^k^  ,  respectively^ 
a(j) 


Thus  for  each  =  Yj^+^(t)  at  which  (zVj_i  1  r^= j )  is  continuous 


k+1  k+1 1  k 


with  (D.6.1),  (D.6.2),  (D.6.7)  holding  for  t  and  l  =>  t+1; 


„t+l,L,  ..  „t+l,U, 

Vk  (V3)'Vk  (*k#3). 


v^U(Vj)  or 


,t+l,U 


(Vj) 


(D.6.9) 


for  each  x^.  This  verifies  (i)  of  Proposition  9.3. 


Suppose that (D.6.1) - (D.6.2) hold for t and $\ell = t+1$ and $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ is continuous at $\hat\gamma^j_{k+1}(t)$ but (D.6.7) does not hold. Then by Lemma D.6.1 we have


0j[(t+l)  >  0£(t) 


(D.6.10) 


From  (D.6.6) ,  (D.6.10)  we  have  that 


vk,R<v^  5  VT1,L^ 


may  be  optimal  for 


ej(t)  <  a ( j )  <  0-j(t+l) 


Therefore  we  have  verified  (i)  of  Proposition  9.3. 

Proposition  9.3(ii)  follows  directly  from  Proposition  9.2(3) 


Now consider Proposition 9.3(iii) - (iv): if $\hat V_{k+1}(z_{k+1} \mid r_k = j)$ is discontinuous at $\hat\gamma^j_{k+1}(t)$ with (9.141) holding then clearly (from (9.117), (9.120)) we have $V_k^{t,R}(x_k,j) < V_k^{t+1,L}(x_k,j)$. So in this case $V_k^{t+1,L}(x_k,j)$ cannot be optimal for any $x_k$. However, $V_k^{t,R}(x_k,j)$ may be. This verifies Proposition 9.3(iii); (iv) follows analogously.

Thus we need only consider the candidate costs-to-go listed in the statement of Proposition 9.3. □


D.7  Proof  of  Proposition  9.6: 


1.  Differentiation of $V_k^{t,L}(x_k,j)$ in (9.117) and $V_k^{t,R}(x_k,j)$ in (9.120) with respect to $x_k$ yields (9.144) - (9.145) for any $x_k$ where these actively-constrained costs are optimal.

For any $x_k$ from which some $V_k^{t,U}(x_k,j)$ is optimal, Proposition 9.2(4) and

$$\frac{\partial V_k^{t,U}(x_k,j)}{\partial x_k} = a(j)\,\frac{\partial \hat V^j_{k+1}(z_{k+1};t)}{\partial z_{k+1}}\bigg|_{z_{k+1} = z^{t,U}_{k+1}(x_k,j)}$$

yields (9.144) - (9.145).
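The slope relation used above can be checked numerically. The sketch below assumes the subproblem cost $R\,((z - a x_k)/b)^2 + \hat V(z)$ with a convex $\hat V$, finds the minimizing $z$ by ternary search, and compares a central finite difference of the optimal cost with $a\,\hat V'(z^*)$. The model form, the search routine, and all names are assumptions made for illustration.

```python
def argmin_convex(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a strictly convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def optimal_cost_and_next_state(x, a, b, R, V, lo, hi):
    """Minimize R*((z - a*x)/b)**2 + V(z) over z in [lo, hi]; return (cost, z*)."""
    def f(z):
        return R * ((z - a * x) / b) ** 2 + V(z)
    z_star = argmin_convex(f, lo, hi)
    return f(z_star), z_star


def slope_check(x, a, b, R, V, dV, lo, hi, h=1e-4):
    """Return (finite-difference slope of the optimal cost, a * dV(z*))."""
    c_plus, _ = optimal_cost_and_next_state(x + h, a, b, R, V, lo, hi)
    c_minus, _ = optimal_cost_and_next_state(x - h, a, b, R, V, lo, hi)
    _, z_star = optimal_cost_and_next_state(x, a, b, R, V, lo, hi)
    return (c_plus - c_minus) / (2.0 * h), a * dV(z_star)
```

The two returned numbers should agree whenever the minimizer lies strictly inside the interval, which is the inactive-constraint situation the relation describes.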


At  joining  points  6  *  x^  where  V^x^r^-j)  is  differentiable 
and  z]t+i^xk,rk“^  are  clearly  continuous, 


2. 


At a joining point $x_k = \delta$ where the slope of $V_k(x_k, r_k=j)$ decreases discontinuously, (9.144), (9.145) yield (i) and (ii) directly.


From 3(ii) we have that the mapping

$$x_k \mapsto z_{k+1}(x_k, r_k=j)$$

increases discontinuously at joining points where $V_k(x_k, r_k=j)$ is not differentiable, and from (2), the mapping is continuous at other joining points.

Now  between  joining  points,  if  the  optimal  cost 
corresponds  to  hedging-to-a-point  then  clearly  the 
mapping  is  constant.  If  the  optimal  cost  does  not 
correspond  to  hedging-to-a-point,  then  in  such  a 
region 


$$V_k(x_k, r_k=j) = V_k^{t,U}(x_k,j)$$


for some $t \in \{1, \ldots, \hat\psi_{k+1}\}$. Thus


,  ,  H,  b2d>  jvvyi' 

2k-H(ltk'  k"J>  "  a  3  *k  2a(j)R(j)  3*. 


t(U,  ^  ...  b2(j)  avk+l(zk+l;t) 

Vl!VVJI  -  a<3>*k  -  2,ljl8(j)  - 5^  - 


t,U, 

k+1  k.l^k'31 


(D.7.X) 


From  (D .7.5)  and  (D . 7.3)  we  have 
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